sensuc                package:Design                R Documentation

_S_e_n_s_i_t_i_v_i_t_y _t_o _U_n_m_e_a_s_u_r_e_d _C_o_v_a_r_i_a_b_l_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     Performs an analysis of the sensitivity of a binary treatment (X)
     effect to an unmeasured binary confounder (U) for a fitted binary
     logistic or an unstratified non-time-dependent Cox survival model
     (the function works well for the former, not so well for the
     latter).  This is done by fitting a sequence of models with
     separately created U variables added to the original model.  The
     sequence of models is formed  by simultaneously varying a and b,
     where a measures the association between U and X and b measures
     the association between U and Y, where Y is the outcome of
     interest.  For Cox models, an approximate solution is used by
     letting Y represent some binary classification of the
     event/censoring time and the event indicator. For example, Y could
     be just be the event indicator, ignoring time of the event or
     censoring, or it could be 1 if a subject failed before one year
     and 0 otherwise. When for each combination of a and b the vector
     of binary values U is generated, one of two methods is used to
     constrain the properties of U.  With either method, the overall
     prevalence of U is constrained to be 'prev.u'. With the default
     method ('or.method="x:u y:u"'), U is sampled so that the X:U odds
     ratio is a and the Y:U odds ratio is b.  With the second method, U
     is sampled according to the model  logit(U=1 | X, Y) = alpha +
     beta*Y + gamma*X, where beta=log(b) and gamma=log(a) and alpha is
     determined so that the prevalence of U=1 is 'prev.u'.  This second
     method results in the adjusted odds ratio for Y:U given X being b
     whereas the default method forces the unconditional (marginal) Y:U
     odds ratio to be b.  Rosenbaum uses the default method.

     There is a 'plot' method for plotting objects created by 'sensuc'.
     Values of a are placed on the x-axis and observed marginal odds or
     hazards ratios for U (unadjusted ratios) appear on the y-axis. 
     For Cox models, the hazard ratios will not agree exactly with
     X:event indicator odds ratios but they sometimes be made close
     through judicious choice of the 'event' function.  The default
     plot uses four symbols which differentiate whether for the a,b
     combination the effect of X adjusted for U (and for any other
     covariables that were in the original model fit) is positive
     (usually meaning an effect ratio greater than 1) and
     "significant", merely positive, not positive and non significant,
     or not positive but significant.  There is also an option to draw
     the numeric value of the X effect ratio at the a,b combination
     along with its Z statistic underneath in smaller letters, and an
     option to draw the effect ratio in one of four colors depending on
     the significance of the Z statistic.

_U_s_a_g_e:

     # fit <- lrm(formula=y ~ x + other.predictors, x=TRUE, y=TRUE)  #or
     # fit <- cph(formula=Surv(event.time,event.indicator) ~ x + other.predictors,
     #            x=TRUE, y=TRUE)

     sensuc(fit,  
            or.xu=seq(1, 6, by = 0.5), or.u=or.xu, 
            prev.u=0.5, constrain.binary.sample=TRUE, 
            or.method=c("x:u y:u","u|x,y"),
            event=function(y) if(is.matrix(y))y[,ncol(y)] else 1*y)

     ## S3 method for class 'sensuc':
     plot(x,  ylim=c((1+trunc(min(x$effect.u)-.01))/
                        ifelse(type=='numbers',2,1),
                        1+trunc(max(x$effect.u)-.01)),
          xlab='Odds Ratio for X:U',
          ylab=if(x$type=='lrm')'Odds Ratio for Y:U' else
               'Hazard Ratio for Y:U',
          digits=2, cex.effect=.75, cex.z=.6*cex.effect,
          delta=diff(par('usr')[3:4])/40, 
          type=c('symbols','numbers','colors'),
          pch=c(15,18,5,0), col=c(2,3,1,4), alpha=.05,
          impressive.effect=function(x)x > 1,...)

_A_r_g_u_m_e_n_t_s:

     fit: result of 'lrm' or 'cph' with 'x=TRUE, y=TRUE'.  The first
          variable in the right hand side of the model formula must
          have been the binary X variable, and it may not interact with
          other predictors. 

       x: result of 'sensuc' 

   or.xu: vector of possible odds ratios measuring the X:U association. 

    or.u: vector of possible odds ratios measuring the Y:U association.
          Default is 'or.xu'. 

  prev.u: desired prevalence of U=1.  Default is 0.5, which is usually
          a "worst case" for sensitivity analyses. 

constrain.binary.sample: By default, the binary U values are sampled
          from the appropriate distributions conditional on Y and X so
          that the proportions of U=1 in each sample are exactly the
          desired probabilities, to within the closeness of
          ntimesprobability to an integer.  Specify
          'constrain.binary.sample=FALSE' to sample from ordinary
          Bernoulli distributions, to allow proportions of U=1 to
          reflect sampling fluctuations. 

or.method: see above 

   event: a function classifying the response variable into a binary
          event for the purposes of constraining the association
          between U and Y. For binary logistic models, 'event' is left
          at its default value, which is the identify function, i.e,
          the original Y values are taken as the events (no other
          choice makes any sense here).  For Cox models, the  default
          'event' function takes the last column of the 'Surv' object
          stored with the fit.  For rare events (high proportion of
          censored observations), odds ratios approximate hazard
          ratios, so the default is OK.   For other cases, the survival
          times should be considered (probably in conjunction with the
          event indicators), although it may not be possible to get a
          high enough hazard ratio between U and Y by sampling U by
          temporarily making Y binary.  See the last example which is
          for a 2-column 'Surv' object (first column of response
          variable=event time,  second=event indicator).  When
          dichotomizing survival time at a given point, it is
          advantageous to choose the cutpoint so that not many censored
          survival times preceed the cutpoint. Note that in fitting Cox
          models to examine sensitivity to U, the original
          non-dichotomized failure times are used. 

    ylim: y-axis limits for 'plot' 

    xlab: x-axis label 

    ylab: y-axis label 

  digits: number of digits to the right of the decimal point for
          drawing numbers on the plot, for 'type="numbers"' or
          'type="colors"'. 

cex.effect: character size for drawing effect ratios 

   cex.z: character size for drawing Z statistics 

   delta: decrement in y value used to draw Z values below effect
          ratios 

    type: specify '"symbols"' (the default), '"numbers"', or '"colors"'
          (see above) 

     pch: 4 plotting characters corresponding to positive and
          significant effects for X, positive and non-significant
          effects, not positive and not significant, not positive but
          significant 

     col: 4 colors as for 'pch' 

   alpha: significance level 

impressive.effect: a function of the odds or hazard ratio for X
          returning 'TRUE' for a positive effect.  By default, a
          positive effect is taken to mean a ratio exceeding one. 

     ...: optional arguments passed to 'plot' 

_V_a_l_u_e:

     'sensuc' returns an object of class '"sensuc"' with the following
     elements: 'OR.xu' (vector of desired X:U odds ratios or a values),
     'OOR.xu' (observed marginal X:U odds ratios), 'OR.u' (desired Y:U
     odds ratios or b values), 'effect.x' (adjusted odds or hazards
     ratio for X in a model adjusted for U and all of the other
     predictors), 'effect.u' (unadjusted Y:U odds or hazards ratios),
     'effect.u.adj' (adjusted Y:U odds or hazards ratios), Z
     (Z-statistics), 'prev.u' (input to 'sensuc'), 'cond.prev.u'
     (matrix with one row per a,b combination, specifying prevalences
     of U conditional on Y and X combinations), and 'type' ('"lrm"' or
     '"cph"').

_A_u_t_h_o_r(_s):

     Frank Harrell
      Mark Conaway
      Department of Biostatistics
      Vanderbilt University School of Medicine
      f.harrell@vanderbilt.edu, mconaway@virginia.edu

_R_e_f_e_r_e_n_c_e_s:

     Rosenbaum, Paul R (1995): Observational Studies.  New York:
     Springer-Verlag.

     Rosenbaum P, Rubin D (1983): Assessing sensitivity to an
     unobserved binary covariate in an observational study with binary
     outcome.  J Roy Statist Soc B 45:212-218.

_S_e_e _A_l_s_o:

     'lrm', 'cph', 'sample'

_E_x_a_m_p_l_e_s:

     set.seed(17)
     x <- sample(0:1, 500,TRUE)
     y <- sample(0:1, 500,TRUE)
     y[1:100] <- x[1:100]  # induce an association between x and y
     x2 <- rnorm(500)

     f <- lrm(y ~ x + x2, x=TRUE, y=TRUE)

     #Note: in absence of U odds ratio for x is exp(2nd coefficient)

     g <- sensuc(f, c(1,3))

     # Note: If the generated sample of U was typical, the odds ratio for
     # x dropped had U been known, where U had an odds ratio
     # with x of 3 and an odds ratio with y of 3

     plot(g)

     # Fit a Cox model and check sensitivity to an unmeasured confounder

     # f <- cph(Surv(d.time,death) ~ treatment + pol(age,2)*sex, x=TRUE, y=TRUE)
     # sensuc(f, event=function(y) y[,2] & y[,1] < 365.25 )
     # Event = failed, with event time before 1 year
     # Note: Analysis uses f$y which is a 2-column Surv object

