FPTbinary             package:DPpackage             R Documentation

_B_a_y_e_s_i_a_n _a_n_a_l_y_s_i_s _f_o_r _a _F_i_n_i_t_e _P_o_l_y_a _T_r_e_e _B_e_r_n_o_u_l_l_i _r_e_g_r_e_s_s_i_o_n _m_o_d_e_l

_D_e_s_c_r_i_p_t_i_o_n:

     This function generates a posterior density sample for a binary
     regression model using a Finite Polya tree prior for the link
     function.

_U_s_a_g_e:

     FPTbinary(formula,baseline="logistic",prior,mcmc,state,status,misc=NULL,
               data=sys.frame(sys.parent()),na.action=na.fail)

_A_r_g_u_m_e_n_t_s:

 formula: a two-sided linear formula object describing the model fit,
          with the response on the left of a '~' operator and the
          terms, separated by '+' operators, on the right.

baseline: a description of the baseline error distribution to be used
          in the model. The only baseline distribution currently
          supported by 'FPTbinary' is 'logistic'.

   prior: a list giving the prior information. The list includes the
          following parameters: 'a0' and 'b0' giving the
          hyperparameters of the gamma prior distribution of the
          precision parameter of the Finite Polya tree prior, 'alpha'
          giving the value of the precision parameter (it must be
          specified if 'a0' and 'b0' are missing), 'beta0' and 'Sbeta0'
          giving the hyperparameters of the normal prior distribution
          for the regression coefficients, and 'M' giving the finite
          level to be considered for the Finite Polya tree.

    mcmc: a list giving the MCMC parameters. The list must include the
          following integers: 'nburn' giving the number of burn-in 
          scans, 'nskip' giving the thinning interval, 'nsave' giving
          the total number of scans to be saved, 'ndisplay' giving the
          number of saved scans to be displayed on the screen (the
          function reports  on the screen when every 'ndisplay'
          iterations have been carried out), and 'tune1' and 'tune2'
          giving the Metropolis tuning  parameters for the regression
          coefficients and precision parameter, respectively (the
          default value is 1.1).

   state: a list giving the current value of the parameters. This list
          is used if the current analysis is the continuation of a
          previous analysis.

  status: a logical variable indicating whether this run is new
          ('TRUE') or the  continuation of a previous analysis
          ('FALSE'). In the latter case the current value of the
          parameters must be specified in the  object 'state'.

    misc: misclassification information. When used, this list must
          include two objects, 'sens' and 'spec', giving the
          sensitivity and specificity, respectively. Each can be a
          vector or a scalar. This information is used to correct for
          misclassification in the conditional Bernoulli model.

    data: data frame.

na.action: a function that indicates what should happen when the data
          contain 'NA's. The default action ('na.fail') causes 
          'FPTbinary' to print an error message and terminate if there
          are any incomplete observations.
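
     As an illustration of the list arguments described above, the
     following sketch builds 'prior', 'mcmc' and 'misc' lists for a
     model with an intercept and two covariates. All numeric values
     are hypothetical and only show the expected structure; they are
     not recommended settings.

```r
# Hypothetical model size: intercept plus two covariates.
p <- 3

# Precision parameter treated as random: alpha ~ Gamma(a0, b0).
prior.random <- list(a0 = 1, b0 = 1,
                     beta0 = rep(0, p),
                     Sbeta0 = diag(100, p),
                     M = 5)

# Precision parameter fixed at alpha = 1; 'a0' and 'b0' are omitted,
# so 'alpha' must be supplied.
prior.fixed <- list(alpha = 1,
                    beta0 = rep(0, p),
                    Sbeta0 = diag(100, p),
                    M = 5)

# MCMC settings: burn-in, thinning, saved scans, display frequency,
# and the two Metropolis tuning parameters.
mcmc <- list(nburn = 1000, nskip = 5, nsave = 2000, ndisplay = 100,
             tune1 = 1.1, tune2 = 1.1)

# Assumed (hypothetical) sensitivity and specificity of the observed
# binary response.
misc <- list(sens = 0.95, spec = 0.90)
```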

_D_e_t_a_i_l_s:

     This generic function fits a semiparametric binary regression
     model using a Finite Polya tree (FPT) prior for the link function
     (see Hanson, 2006; Jara, Garcia-Zattera and Lesaffre, 2006):

                   yi = I(Vi <= Xi beta), i=1,...,n


                          V1,...,Vn | G ~ G


                       G | alpha ~ FPT^M(Pi,A)


     where the FPT is centered around a Logistic(0,1) distribution if
     the baseline is 'logistic', by taking the partition Pi at each
     level m to coincide with the k/2^m, k=0,...,2^m, quantiles of the
     Logistic(0,1) distribution. The family A = {alpha_e : e in E^*},
     where E^* = union_{m=1}^{M} E^m and E^m is the m-fold product of
     E = {0,1}, is specified as alpha_{e1 ... em} = alpha m^2. To
     complete the model specification, independent hyperpriors are
     assumed,

                    alpha | a0, b0 ~ Gamma(a0,b0)


                beta | beta0, Sbeta0 ~ N(beta0,Sbeta0)


     The precision parameter, alpha, of the 'FPT' prior can be
     considered as random, having a 'gamma' distribution, Gamma(a0,b0),
     or fixed at some particular value. To fix alpha at a particular
     value, set a0 to NULL in the prior specification.
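
     The centering construction described above can be sketched in
     base R. This is only an illustration of the partition endpoints
     and of the level-wise parameters alpha m^2, not the package's
     internal code; the values of 'M' and 'alpha' are hypothetical.

```r
M <- 3        # finite level of the tree (hypothetical value)
alpha <- 1    # precision parameter (hypothetical value)

# Endpoints of the level-m partition: the k/2^m quantiles,
# k = 0, ..., 2^m, of the Logistic(0,1) baseline.
partition <- lapply(seq_len(M), function(m) qlogis((0:(2^m)) / 2^m))

# Parameter shared by all sets at level m: alpha * m^2.
alpha.level <- alpha * seq_len(M)^2

partition[[2]]   # five endpoints defining the four level-2 sets
alpha.level
```

     Each level m yields 2^m sets with equal probability under the
     baseline, and the parameters grow with m^2 as specified in the
     model.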

     In the computational implementation of the model,
     Metropolis-Hastings steps are used to  sample the posterior
     distribution of the regression coefficients and the precision
     parameter, as described in Hanson (2006), and Jara, Garcia-Zattera
     and Lesaffre (2006).

_V_a_l_u_e:

     An object of class 'FPTbinary' representing the semiparametric
     logistic regression model fit. Generic functions such as 'print',
     'plot', 'predict', 'summary', and 'anova' have methods to show the
     results of the fit.  The results include 'beta', the precision
     parameter ('alpha'), and the 'link' function.

     The MCMC samples of the parameters and the errors in the model are
     stored in the objects 'thetasave' and 'randsave', respectively.
     Both objects are included in the list 'save.state' and are
     matrices which can be analyzed directly by functions provided by
     the coda package.
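
     As a rough sketch of such an analysis, the code below summarizes
     a matrix of saved scans. A simulated matrix stands in for
     'fit$save.state$thetasave' so that the code runs without a fitted
     model; its column names and values are hypothetical. The coda
     step is run only if that package is installed.

```r
set.seed(1)

# Simulated stand-in for fit$save.state$thetasave:
# 1000 saved scans (rows) of three parameters (columns).
thetasave <- cbind(age   = rnorm(1000, 0.5, 0.1),
                   xray  = rnorm(1000, 1.2, 0.3),
                   alpha = rgamma(1000, shape = 2, rate = 2))

# Posterior means and equal-tail 95% credible intervals.
post.mean <- colMeans(thetasave)
post.ci <- apply(thetasave, 2, quantile, probs = c(0.025, 0.975))

# Optional: wrap the matrix as a coda 'mcmc' object.
if (requireNamespace("coda", quietly = TRUE)) {
  chain <- coda::mcmc(thetasave)
  print(coda::effectiveSize(chain))
}
```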

     The list 'state' in the output object contains the current value
     of the parameters necessary to restart the analysis. If you want
     to specify different starting values to run multiple chains, set
     'status=TRUE' and create the list 'state' based on these starting
     values. In this case the list 'state' must include the following
     objects:

    beta: giving the value of the regression coefficients.

       v: giving the value of the errors (it must be consistent with
          'yi = I(Vi <= Xi beta)').

       y: giving the value of the true response binary variable (only
          if the model considers correction for misclassification).

   alpha: giving the value of the precision parameter.
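
     A minimal sketch of building such a list follows. The data,
     starting values and the offset of 0.5 are hypothetical; the only
     point is that 'v' is chosen so that each observation satisfies
     yi = I(Vi <= Xi beta) for the chosen 'beta'.

```r
set.seed(2)

n <- 20
X <- cbind(1, rnorm(n))      # hypothetical design matrix with intercept
y <- rbinom(n, 1, 0.5)       # hypothetical observed binary responses

beta <- c(0, 0.5)            # chosen starting values for the coefficients
eta <- as.vector(X %*% beta) # linear predictor Xi beta

# Place each error below eta when y = 1 and above eta when y = 0,
# so that y = I(v <= eta) holds for every observation.
v <- ifelse(y == 1, eta - 0.5, eta + 0.5)

state <- list(beta = beta, v = v, alpha = 1)

# Consistency check between the responses and the starting errors.
stopifnot(all(y == as.integer(v <= eta)))
```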

_A_u_t_h_o_r(_s):

     Alejandro Jara <Alejandro.JaraVallejos@med.kuleuven.be>

     Tim Hanson <hanson@biostat.umn.edu>

_R_e_f_e_r_e_n_c_e_s:

     Hanson, T. (2006) Inference for Mixtures of Finite Polya tree
     models.  Journal of the American Statistical Association, 101:
     1548-1565. 

     Jara, A., Garcia-Zattera, M.J., Lesaffre, E. (2006) Semiparametric
     Bayesian Analysis of Misclassified Binary Data. In preparation.

     Lavine, M. (1992) Some aspects of Polya tree distributions for
     statistical modelling. The Annals of Statistics, 20: 1222-1235.

     Lavine, M. (1994) More aspects of Polya tree distributions for
     statistical  modelling. The Annals of Statistics, 22: 1161-1176.

_E_x_a_m_p_l_e_s:

     ## Not run: 

         # Prostate cancer data example
           data(nodal)
           attach(nodal)
           lacid<-log(acid)

         # Initial state
           state <- NULL

         # MCMC parameters
           nburn<-20000
           nsave<-10000
           nskip<-10
           ndisplay<-100
           mcmc <- list(nburn=nburn,nsave=nsave,nskip=nskip,ndisplay=ndisplay,
                         tune1=1.1,tune2=1.1)

         # Prior distribution
           prior <- list(alpha=1, beta0=c(0,rep(0.75,5)),
                         Sbeta0=diag(c(100,rep(25,5)),6),M=5)

         # Fitting the Finite Polya tree model
           fit1 <-FPTbinary(ssln~age+lacid+xray+size+grade,prior=prior,mcmc=mcmc,
                            state=state,status=TRUE) 
           fit1

         # Summary with HPD and Credibility intervals
           summary(fit1)
           summary(fit1,hpd=FALSE)

         # Plot model parameters (to see the plots gradually set ask=TRUE)
           plot(fit1)
           plot(fit1,nfigr=2,nfigc=2)        

         # Plot a specific model parameter (to see the plots gradually 
         # set ask=TRUE)
           plot(fit1,ask=FALSE,nfigr=1,nfigc=2,param="xray") 
           plot(fit1,ask=FALSE,param="link",nfigc=1,nfigr=1)

         # Table of Pseudo Contour Probabilities
           anova(fit1)

         # Fitting parametric models

           nburn<-20000
           nsave<-10000
           nskip<-10
           ndisplay<-100
           mcmc <- list(nburn=nburn,nsave=nsave,nskip=nskip,ndisplay=ndisplay,
                         tune=1.1)

           fit2<-Pbinary(ssln~age+lacid+xray+size+grade,link="probit",
                         prior=prior,mcmc=mcmc,state=state,status=TRUE)  
                         
           fit3<-Pbinary(ssln~age+lacid+xray+size+grade,link="logit",
                         prior=prior,mcmc=mcmc,state=state,status=TRUE)  

         
         # Model comparison

           DPpsBF(fit1,fit2,fit3)
           
     ## End(Not run)      

