calibrate               package:Design               R Documentation

_R_e_s_a_m_p_l_i_n_g _M_o_d_e_l _C_a_l_i_b_r_a_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Uses bootstrapping or cross-validation to get bias-corrected
     (overfitting-corrected) estimates of predicted vs. observed
     values based on subsetting predictions into intervals (for
     survival models) or on nonparametric smoothers (for other models).
     There are calibration functions for Cox ('cph'), parametric
     survival models ('psm'), binary and ordinal logistic models
     ('lrm') and ordinary least squares ('ols'). For survival models,
     "predicted" means predicted survival probability at a single time
     point, and "observed" refers to the corresponding Kaplan-Meier 
     survival estimate, stratifying on intervals of predicted survival.
     For logistic and linear models, a nonparametric calibration curve
     is estimated over a sequence of predicted values. The fit must
     have specified 'x=TRUE, y=TRUE'.  The 'print' and 'plot' methods
     for 'lrm' and 'ols' models (which use 'calibrate.default') print
     the mean absolute error in predictions, the mean squared error,
     and the 0.9 quantile of the absolute error.  Here, error refers to
     the difference between the predicted values and the corresponding
     bias-corrected calibrated values.
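
     The three summary statistics named above can be illustrated with
     a minimal base R sketch.  The vectors 'predicted' and 'calibrated'
     are made-up values for illustration only; they are not output of
     'calibrate', and the exact points at which the package evaluates
     the error are not shown here.

```r
# Hypothetical inputs (assumed for illustration): predicted values and
# the corresponding bias-corrected calibrated values
predicted  <- c(0.20, 0.35, 0.50, 0.65, 0.80)
calibrated <- c(0.24, 0.36, 0.47, 0.60, 0.82)

# Error = predicted value minus bias-corrected calibrated value
err     <- predicted - calibrated
abs_err <- abs(err)

mean_abs_err <- mean(abs_err)                          # 0.03
mean_sq_err  <- mean(err^2)                            # 0.0011
q90_abs_err  <- quantile(abs_err, 0.9, names = FALSE)  # 0.046

print(c(mean.abs.err = mean_abs_err,
        mean.sq.err  = mean_sq_err,
        q90.abs.err  = q90_abs_err))
```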

     Below, the second, third, and fourth invocations of 'calibrate'
     are, respectively, for 'ols' and 'lrm' fits, 'cph' fits, and 'psm'
     fits.  The first 'plot' invocation is for survival model fits
     ('cph', 'psm') and the second is for 'lrm' and 'ols' fits.
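
     The idea of a bootstrap bias-corrected calibration curve can be
     sketched in base R as follows.  This is an illustration of the
     general optimism-bootstrap approach, assuming a binary logistic
     model fitted with 'glm'; it is not the package's own code, and
     the helper 'cal_curve', the fixed 'grid', and the use of 'approx'
     are assumptions made for this sketch.

```r
set.seed(1)
n <- 200
d <- data.frame(x = runif(n))
d$y <- rbinom(n, 1, plogis(-1 + 2 * d$x))   # simulated binary outcome

# Hypothetical helper: smooth the 0/1 outcome against the predicted
# probability and read the smooth off at fixed grid points
cal_curve <- function(p, y, grid) {
  s <- lowess(p, y, iter = 0)
  approx(s$x, s$y, xout = grid, ties = mean, rule = 2)$y
}

grid <- seq(0.3, 0.7, length.out = 21)      # assumed prediction points

# Apparent calibration: model fitted and evaluated on the same data
fit      <- glm(y ~ x, family = binomial, data = d)
apparent <- cal_curve(fitted(fit), d$y, grid)

# Optimism: for each resample, calibration on the bootstrap sample
# minus calibration of the same refitted model on the original data
B <- 40
optimism <- matrix(NA_real_, nrow = B, ncol = length(grid))
for (b in seq_len(B)) {
  i      <- sample(n, replace = TRUE)
  fb     <- glm(y ~ x, family = binomial, data = d[i, ])
  p_boot <- fitted(fb)
  p_orig <- predict(fb, newdata = d, type = "response")
  optimism[b, ] <- cal_curve(p_boot, d$y[i], grid) -
                   cal_curve(p_orig, d$y, grid)
}

# Bias-corrected curve: apparent calibration minus average optimism
corrected <- apparent - colMeans(optimism)
```

     Plotting 'grid' against 'apparent' and 'corrected' shows how far
     the overfitting correction moves the curve in this simulated case.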

_U_s_a_g_e:

     calibrate(fit, ...)
     ## Default S3 method:
     calibrate(fit, predy, 
       method=c("boot","crossvalidation",".632","randomization"),
       B=40, bw=FALSE, rule=c("aic","p"),
       type=c("residual","individual"),
       sls=.05, pr=FALSE, kint, smoother="lowess", ...)
     ## S3 method for class 'cph':
     calibrate(fit, method="boot", u, m=150, cuts, B=40, 
       bw=FALSE, rule="aic", type="residual", sls=0.05, aics=0, 
       pr=FALSE, what="observed-predicted", tol=1e-12, ...)
     ## S3 method for class 'psm':
     calibrate(fit, method="boot", u, m=150, cuts, B=40,
       bw=FALSE,rule="aic",
       type="residual",sls=.05,aics=0,
       pr=FALSE,what="observed-predicted",tol=1e-12, maxiter=15, 
       rel.tolerance=1e-5, ...)

     ## S3 method for class 'calibrate':
     print(x, ...)
     ## S3 method for class 'calibrate.default':
     print(x, ...)

     ## S3 method for class 'calibrate':
     plot(x, xlab, ylab, subtitles=TRUE, conf.int=TRUE,
     ...)

     ## S3 method for class 'calibrate.default':
     plot(x, xlab, ylab, xlim, ylim,
       legend=TRUE, subtitles=TRUE, ...)

_A_r_g_u_m_e_n_t_s:

     fit: a fit from 'ols', 'lrm', 'cph' or 'psm' 

       x: an object created by 'calibrate'

method, B, bw, rule, type, sls, aics: see 'validate'

       u: the time point for which to validate predictions for survival
          models. For 'cph' fits, you must have specified 'surv=TRUE,
          time.inc=u', where 'u' is the constant specifying the time to
          predict. 

       m: group predicted survival at 'u' time units into intervals
          containing 'm' subjects on average (for survival models
          only) 

    cuts: actual cut points for predicted survival probabilities. You
          may specify only one of 'm' and 'cuts' (for survival models
          only) 

      pr: set to 'TRUE' to print intermediate results for each
          re-sample 

    what: The default is '"observed-predicted"', meaning to estimate
          optimism in this difference. This is preferred as it accounts
          for skewed distributions of predicted probabilities in outer
          intervals. You can also specify '"observed"'.  This argument
          applies to survival models only. 

     tol: criterion for matrix singularity (default is '1e-12')

 maxiter: for 'psm', this is passed to 'survreg.control' (default is 15
          iterations) 

rel.tolerance: parameter passed to 'survreg.control' for 'psm' (default
          is 1e-5). 

   predy: a scalar or vector of predicted values to calibrate (for
          'lrm', 'ols').  Default is 50 equally spaced points between
          the 5th smallest and the 5th largest predicted values.  For
          'lrm' the predicted values are probabilities (see 'kint'). 

    kint: For an ordinal logistic model, the default is to calibrate
          the predicted probability that Y >= the middle level.  Use
          'kint' to choose the intercept, e.g., 'kint=2' means to
          calibrate Prob(Y >= b), where b is the second level of Y. 

smoother: a function of two variables that produces x- and
          y-coordinates by smoothing the input 'y'.  The default is to
          use 'lowess(x, y, iter=0)'.  

     ...: other arguments to pass to 'predab.resample', such as
          'group', 'cluster', and 'subset'. Also, other arguments for
          'plot'. 

    xlab: defaults to "Predicted x-units Survival" or to a suitable
          label for other models 

    ylab: defaults to "Fraction Surviving x-units" or to a suitable
          label for other models 

xlim, ylim: 2-vectors specifying x- and y-axis limits, if not using
          defaults

subtitles: set to 'FALSE' to suppress the plot subtitles, which describe
          the method and, for 'lrm' and 'ols', give the mean absolute
          error and original sample size 

conf.int: set to 'FALSE' to suppress plotting 0.95 confidence intervals
          for Kaplan-Meier estimates 

  legend: set to 'FALSE' to suppress legends (for 'lrm', 'ols' only) on
          the calibration plot, or specify a list with elements 'x' and
          'y' containing the coordinates of the upper left corner of
          the legend.  By default, a legend will be drawn in the lower
          right 1/16th of the plot. 
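
     The defaults described for 'predy' and 'smoother' can be
     illustrated in base R.  This is a sketch only: 'p' and 'y' are
     simulated, and 'default_smoother' and 'wider_smoother' are
     hypothetical names used to show the two-argument contract; the
     code does not call 'calibrate' itself.

```r
set.seed(2)
p <- runif(100)          # hypothetical predicted probabilities
y <- rbinom(100, 1, p)   # hypothetical binary outcomes

# Default-style grid: 50 equally spaced points between the 5th
# smallest and the 5th largest predicted values
ps    <- sort(p)
predy <- seq(ps[5], ps[length(ps) - 4], length.out = 50)

# Default-style smoother: lowess without robustness iterations
default_smoother <- function(x, y) lowess(x, y, iter = 0)

# Hypothetical alternative with the same contract (a function of
# x and y returning a list with components x and y)
wider_smoother <- function(x, y) lowess(x, y, f = 0.9, iter = 0)

# Apparent (uncorrected) calibration curve evaluated at predy
s <- default_smoother(p, y)
curve_at_predy <- approx(s$x, s$y, xout = predy, ties = mean)$y
```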

_D_e_t_a_i_l_s:

     If the fit was created using penalized maximum likelihood
     estimation, the same 'penalty' and 'penalty.scale' parameters are
     used during validation.

_V_a_l_u_e:

     matrix specifying mean predicted survival in each interval, the
     corresponding estimated bias-corrected Kaplan-Meier estimates,
     number of subjects, and other statistics.  For linear and logistic
     models, the matrix instead has rows corresponding to the
     prediction points, and the vector of predicted values being
     validated is returned as an attribute. The returned object has
     class '"calibrate"' or '"calibrate.default"'.
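
     For survival models, the interval structure of the returned
     matrix can be pictured with the base R sketch below.  The
     predicted survival values are simulated, and the rule used to
     turn 'm' into a number of intervals ('floor(n / m)' with quantile
     cut points) is an assumption made for illustration; the
     Kaplan-Meier estimate within each interval is not computed here.

```r
set.seed(3)
n         <- 200
pred_surv <- runif(n, 0.2, 0.95)   # hypothetical predicted survival at u
m         <- 50                    # target number of subjects per interval

# Assumed rule: about n/m intervals, with cut points at quantiles
g        <- max(1, floor(n / m))
breaks   <- quantile(pred_surv, probs = seq(0, 1, length.out = g + 1))
interval <- cut(pred_surv, breaks = breaks, include.lowest = TRUE)

# One row per interval: number of subjects and mean predicted survival
summary_by_interval <- data.frame(
  n              = as.vector(table(interval)),
  mean.predicted = as.vector(tapply(pred_surv, interval, mean))
)
print(summary_by_interval)
```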

_S_i_d_e _E_f_f_e_c_t_s:

     prints, and stores an object 'pred.obs' or '.orig.cal'

_A_u_t_h_o_r(_s):

     Frank Harrell
      Department of Biostatistics
      Vanderbilt University
      f.harrell@vanderbilt.edu

_S_e_e _A_l_s_o:

     'validate', 'predab.resample', 'groupkm', 'errbar', 'cph', 'psm',
     'lowess'

_E_x_a_m_p_l_e_s:

     set.seed(1)
     d.time <- rexp(200)
     x1 <- runif(200)
     x2 <- factor(sample(c('a','b','c'),200,TRUE))
     f <- cph(Surv(d.time) ~ pol(x1,2)*x2, x=TRUE, y=TRUE, surv=TRUE, time.inc=2)
     #or f <- psm(S ~ ...)
     cal <- calibrate(f, u=2, m=50, B=20)  # usually B=200 or 300
     plot(cal)

     y <- sample(0:2, 200, TRUE)
     x1 <- runif(200)
     x2 <- runif(200)
     x3 <- runif(200)
     x4 <- runif(200)
     f <- lrm(y ~ x1+x2+x3*x4, x=TRUE, y=TRUE)
     cal <- calibrate(f, kint=2, predy=seq(.2,.8,length=60), 
                      group=y)
     # group= does k-sample validation: make resamples have same 
     # numbers of subjects in each level of y as original sample

     plot(cal)
     #See the example for the validate function for a method of validating
     #continuation ratio ordinal logistic models.  You can do the same
     #thing for calibrate

