| FPTbinary {DPpackage} | R Documentation |
This function generates a posterior density sample for a binary regression model using a Finite Polya tree prior for the link function.
FPTbinary(formula,baseline="logistic",prior,mcmc,state,status,misc=NULL,
data=sys.frame(sys.parent()),na.action=na.fail)
formula |
a two-sided linear formula object describing the
model fit, with the response on the
left of a ~ operator and the terms, separated by +
operators, on the right. |
baseline |
a description of the baseline error distribution to
be used in the model. The baseline distributions considered by
FPTbinary so far is logistic. |
prior |
a list giving the prior information. The list includes the following
parameters: a0 and b0 giving the hyperparameters for
prior distribution of the precision parameter of the Finite Polya
Tree prior, alpha giving the value of the precision parameter (it
must be specified if a0 and b0 are missing),
beta0 and Sbeta0 giving the
hyperparameters of the normal prior distribution for the regression
coefficients, M giving the finite level
to be considered for the Finite Polya tree. |
mcmc |
a list giving the MCMC parameters. The list must include
the following integers: nburn giving the number of burn-in
scans, nskip giving the thinning interval, nsave giving
the total number of scans to be saved, ndisplay giving
the number of saved scans to be displayed on the screen (the function reports
on the screen when every ndisplay iterations have been carried
out), and tune1 and tune2 giving the Metropolis tuning
parameters for the regression coefficients and precision parameter,
respectively (the default value is 1.1). |
state |
a list giving the current value of the parameters. This list is used if the current analysis is the continuation of a previous analysis. |
status |
a logical variable indicating whether this run is new (TRUE) or the
continuation of a previous analysis (FALSE). In the latter case
the current value of the parameters must be specified in the
object state. |
misc |
misclassification information. When used, this list must include
two objects, sens and spec, giving the sensitivity and
specificity, respectively. Both can be a vector or a scalar.
This information is used to correct for misclassification in the
conditional bernoulli model. |
data |
data frame. |
na.action |
a function that indicates what should happen when the data
contain NAs. The default action (na.fail) causes
FPTbinary to print an error message and terminate if there are any
incomplete observations. |
This generic function fits a semiparametric binary regression model using a Finite Polya tree prior (FPT) for the link function (see, Hanson, 2006; Jara, Garcia-Zattera and Lesaffre, 2006):
yi = I(Vi <= Xi β), i=1,...,n
V1,...,Vn | G ~ G
G | alpha ~ FPT^M(Pi,A)
where, the FPT is centered around a Logistic(0,1) distribution if the baseline is logistic, by
taking each m level of the partition Pi to coincide
with the k/2^m, k=0,...,2^m quantile of the Logistic(0,1) distribution.
The family A={alphae: e in E^{*}},
where E^{*}=bigcup_{m=1}^{M} E^m
and E^m is the m-fold product of E={0,1},
is specified as alpha{e1 ... em}=α m^2.
To complete the model specification, independent hyperpriors are assumed,
alpha | a0, b0 ~ Gamma(a0,b0)
β | beta0, Sbeta0 ~ N(beta0,Sbeta0)
The precision parameter, alpha, of the FPT prior
can be considered as random, having a gamma distribution, Gamma(a0,b0),
or fixed at some particular value. To let alpha to be fixed at a particular
value, set a0 to NULL in the prior specification.
In the computational implementation of the model, Metropolis-Hastings steps are used to sample the posterior distribution of the regression coefficients and the precision parameter, as described in Hanson (2006), and Jara, Garcia-Zattera and Lesaffre (2006).
An object of class FPTbinary representing the semiparametric logistic regression
model fit. Generic functions such as print, plot, predict, summary,
and anova have methods to show the results of the fit.
The results include beta, the precision parameter (alpha), and the link function.
The MCMC samples of the parameters and the errors in the model are stored in the object
thetasave and randsave, respectively. Both objects are included in the
list save.state and are matrices which can be analyzed directly by functions
provided by the coda package.
The list state in the output object contains the current value of the parameters
necessary to restart the analysis. If you want to specify different starting values
to run multiple chains set status=TRUE and create the list state based on
this starting values. In this case the list state must include the following objects:
beta |
giving the value of the regression coefficients. |
v |
giving the value of the errors (it must be consistent with yi = I(Vi < xi beta). |
y |
giving the value of the true response binary variable (only if the model considers correction for misclassification). |
alpha |
giving the value of the precision parameter. |
Alejandro Jara <Alejandro.JaraVallejos@med.kuleuven.be>
Tim Hanson <hanson@biostat.umn.edu>
Hanson, T. (2006) Inference for Mixtures of Finite Polya tree models. Journal of the American Statistical Association, 101: 1548-1565.
Jara, A., Garcia-Zattera, M.J., Lesaffre, E. (2006) Semiparametric Bayesian Analysis of Misclassified Binary Data. In preparation.
Lavine, M. (1992) Some aspects of Polya tree distributions for statistical modelling. The Annals of Statistics, 20: 1222-11235.
Lavine, M. (1994) More aspects of Polya tree distributions for statistical modelling. The Annals of Statistics, 22: 1161-1176.
## Not run:
# Prostate cancer data example
data(nodal)
attach(nodal)
lacid<-log(acid)
# Initial state
state <- NULL
# MCMC parameters
nburn<-20000
nsave<-10000
nskip<-10
ndisplay<-100
mcmc <- list(nburn=nburn,nsave=nsave,nskip=nskip,ndisplay=ndisplay,
tune1=1.1,tune2=1.1)
# Prior distribution
prior <- list(alpha=1, beta0=c(0,rep(0.75,5)),
Sbeta0=diag(c(100,rep(25,5)),6),M=5)
# Fitting the Finite Polya tree model
fit1 <-FPTbinary(ssln~age+lacid+xray+size+grade,prior=prior,mcmc=mcmc,
state=state,status=TRUE)
fit1
# Summary with HPD and Credibility intervals
summary(fit1)
summary(fit1,hpd=FALSE)
# Plot model parameters (to see the plots gradually set ask=TRUE)
plot(fit1)
plot(fit1,nfigr=2,nfigc=2)
# Plot an specific model parameter (to see the plots gradually
# set ask=TRUE)
plot(fit1,ask=FALSE,nfigr=1,nfigc=2,param="xray")
plot(fit1,ask=FALSE,param="link",nfigc=1,nfigr=1)
# Table of Pseudo Contour Probabilities
anova(fit1)
# Fitting parametric models
nburn<-20000
nsave<-10000
nskip<-10
ndisplay<-100
mcmc <- list(nburn=nburn,nsave=nsave,nskip=nskip,ndisplay=ndisplay,
tune=1.1)
fit2<-Pbinary(ssln~age+lacid+xray+size+grade,link="probit",
prior=prior,mcmc=mcmc,state=state,status=TRUE)
fit3<-Pbinary(ssln~age+lacid+xray+size+grade,link="logit",
prior=prior,mcmc=mcmc,state=state,status=TRUE)
# Model comparison
DPpsBF(fit1,fit2,fit3)
## End(Not run)