PTlm                package:DPpackage                R Documentation

_B_a_y_e_s_i_a_n _a_n_a_l_y_s_i_s _f_o_r _a _s_e_m_i_p_a_r_a_m_e_t_r_i_c _l_i_n_e_a_r _r_e_g_r_e_s_s_i_o_n _m_o_d_e_l

_D_e_s_c_r_i_p_t_i_o_n:

     This function generates a posterior density sample from a
     semiparametric linear regression model using a  Mixture of Polya
     Trees prior for the distribution of the errors.

_U_s_a_g_e:

     PTlm(formula,ngrid=200,grid=NULL,prior,mcmc,state,status,
          data=sys.frame(sys.parent()),na.action=na.fail)

_A_r_g_u_m_e_n_t_s:

 formula: a two-sided linear formula object describing the model fit,
          with the response on the left of a '~' operator and the
          terms, separated by '+' operators, on the right.

   ngrid: number of grid points where the error density estimate is 
          evaluated. The default value is 200.

    grid: grid points where the density estimate is  evaluated. The
          default is NULL.

   prior: a list giving the prior information. The list includes the
          following parameter: 'a0' and 'b0' giving the hyperparameters
          for prior distribution of the precision parameter of the
          Polya Tree prior, 'alpha' giving the value of the precision
          parameter (it  must be specified if 'a0' and 'b0' are
          missing, see details below), 'tau1' and 'tau2' giving the 
          hyperparameters for the prior distribution of the variance of
          the normal baseline distribution, 'beta0' and 'Sbeta0' 
          giving the hyperparameters of the normal prior distribution
          for the regression coefficients, optionally 'M' giving the
          finite level to be considered, and 'frstlprob' a logical
          variable indicating whether the first level probabilities of
          the PT are fixed defining a median regression model (the
          default is TRUE). Note that  if 'M' is specified, a Partially
          Specified Mixture of Polya trees is fitted.

    mcmc: a list giving the MCMC parameters. The list must include the
          following integers: 'nburn' giving the number of burn-in 
          scans, 'nskip' giving the thinning interval, 'nsave' giving
          the total number of scans to be saved, and 'ndisplay' giving
          the number of saved scans to be displayed on the screen (the
          function reports  on the screen when every 'ndisplay'
          iterations have been carried out).

   state: a list giving the current value of the parameters. This list
          is used if the current analysis is the continuation of a
          previous analysis.

  status: a logical variable indicating whether this run is new
          ('TRUE') or the  continuation of a previous analysis
          ('FALSE'). In the latter case the current value of the
          parameters must be specified in the  object 'state'.

    data: data frame.

na.action: a function that indicates what should happen when the data
          contain 'NA's. The default action ('na.fail') causes  'PTlm'
          to print an error message and terminate if there are any
          incomplete observations.

_D_e_t_a_i_l_s:

     By default, this generic function fits a median regression model
     using a Scale Mixture of Polya Trees  prior for the distribution
     of the errors (see, e.g., Lavine, 1992 and 1994, Hanson and
     Johnson, 2004):


                     yi = Xi beta + Vi, i=1,...,n


                              Vi | G ~ G


             G | alpha,sigma2 ~ PT(Pi^{sigma2},textit{A})


     where, the PT is centered around a N(0,sigma2) distribution, by
     taking each m level of the partition Pi^{sigma2} to coincide  with
     the k/2^m, k=0,...,2^m quantile of the N(0,sigma2) distribution.
     The family textit{A}={alphae: e in E^{*}}, where
     E^{*}=bigcup_{m=1}^{+infty} E^m  and E^m is the m-fold product of
     E={0,1}, was specified as alpha{e1 ... em}=alpha m^2.  To complete
     the model specification, independent hyperpriors are assumed,

                    alpha | a0, b0 ~ Gamma(a0,b0)


             sigma^-2 | tau1, tau2 ~ Gamma(tau1/2,tau2/2)


     Optionally, if frstlprob=FALSE (the default value is TRUE) is
     specified, a mean regression model is considered. In this case,
     the  following PT prior is considered:

          G | alpha,mu,sigma2 ~ PT(Pi^{mu,sigma2},textit{A})

     where, the PT is centered around a N(mu,sigma2) distribution. In
     this case, the intercept term is automatically excluded from the
     model and the hyperparameters for the normal prior for mu must be
     specified. The normal prior is given by,


                       mu | mub, Sb ~ N(mub,Sb)


     The precision parameter, alpha, of the 'PT' prior  can be
     considered as random, having a 'gamma' distribution, Gamma(a0,b0),
      or fixed at some particular value. To let alpha to be fixed at a
     particular value, set a0 to NULL in the prior specification.

     In the computational implementation of the model,
     Metropolis-Hastings steps are used to  sample the posterior
     distribution of the regression coefficients and  hyperparameters.

_V_a_l_u_e:

     An object of class 'PTlm' representing the semiparametric median
     regression model fit. Generic functions such as 'print', 'plot',
     'summary', and 'anova' have methods to show the results of the
     fit.  The results include 'beta', 'mu', 'sigma2', and the
     precision parameter '\alpha'.

     The list 'state' in the output object contains the current value
     of the parameters  necessary to restart the analysis. If you want
     to specify different starting values  to run multiple chains set
     'status=TRUE' and create the list state based on  this starting
     values. In this case the list 'state' must include the following
     objects:

   alpha: giving the value of the precision parameter.

    beta: giving the value of the regression coefficients.

      mu: giving the mean of the normal baseline distribution (If
          needed).

  sigma2: giving the variance of the normal baseline distribution.

       v: giving the value of the errors (it must be consistent with
          the data.

_A_u_t_h_o_r(_s):

     Alejandro Jara Vallejos <Alejandro.JaraVallejos@med.kuleuven.be>

_R_e_f_e_r_e_n_c_e_s:

     Hanson, T., and Johnson, W. (2002) Modeling regression error with
     a Mixture of Polya Trees. Journal of the American Statistical
     Association,  97: 1020 - 1033.

     Lavine, M. (1992) Some aspects of Polya tree distributions for
     statistical  modelling. The Annals of Statistics, 20: 1222-11235.

     Lavine, M. (1994) More aspects of Polya tree distributions for
     statistical  modelling. The Annals of Statistics, 22: 1161-1176.

_E_x_a_m_p_l_e_s:

     ## Not run: 
         ####################################
         # A simulated Data Set
         # (Mixture of Normals)
         ####################################

           ind<-rbinom(100,1,0.5)
           vsim<-ind*rnorm(100,1,0.15)+(1-ind)*rnorm(100,3,0.15)

           x1<-rep(c(0,1),50)
           x2<-rnorm(100,0,1)

           etasim<-x1+-1*x2
           y<-etasim+vsim

         # Initial state
           state <- NULL

         # MCMC parameters
           nburn<-5000
           nsave<-10000
           nskip<-20
           ndisplay<-100
           mcmc <- list(nburn=nburn,nsave=nsave,nskip=nskip,
                        ndisplay=ndisplay)

         # Prior information
           prior <- list(alpha=1,beta0=rep(0,3),Sbeta0=diag(1000,3),
                         tau1=0.01,tau2=0.01,M=6)

         # Fit the model

           fit1 <- PTlm(formula=y~x1+x2,prior=prior,mcmc=mcmc,state=state,
                        status=TRUE) 

         # Summary with HPD and Credibility intervals
           summary(fit1)
           summary(fit1,hpd=FALSE)

         # Plot model parameters (to see the plots gradually set ask=TRUE)
           plot(fit1)
           plot(fit1,nfigr=2,nfigc=2)

         # Table of Pseudo Contour Probabilities
           anova(fit1)
           

         ############################################
         # The Australian Institute of Sport's data
         # (Skew data example)
         ############################################
           data(sports)
           attach(sports)

         # Initial state
           state <- NULL

         # MCMC parameters

           nburn<-5000
           nsave<-10000
           nskip<-20
           ndisplay<-100
           mcmc <- list(nburn=nburn,nsave=nsave,nskip=nskip,
                        ndisplay=ndisplay)

         # Prior information
           prior <- list(alpha=1,beta0=rep(0,3),Sbeta0=diag(1000,3),
                         tau1=0.01,tau2=0.01,M=8)

         # Fit the model

           fit2 <- PTlm(formula=bmi~lbm+gender,prior=prior,mcmc=mcmc,
                        state=state,status=TRUE) 

         # Summary with HPD and Credibility intervals
           summary(fit2)
           summary(fit2,hpd=FALSE)

         # Plot model parameters (to see the plots gradually set ask=TRUE)
           plot(fit2)
           plot(fit2,nfigr=2,nfigc=2)

         # Table of Pseudo Contour Probabilities
           anova(fit2)

     ## End(Not run)      

