somers2                package:Hmisc                R Documentation

_S_o_m_e_r_s' _D_x_y _R_a_n_k _C_o_r_r_e_l_a_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Computes Somers' Dxy rank correlation between a variable 'x' and a
     binary (0-1) variable 'y', and the corresponding receiver
     operating characteristic curve area 'c'. Note that 'Dxy =
     2(c-0.5)'. 'somers2' allows for a 'weights' variable, which
     specifies frequencies to associate with each observation.

_U_s_a_g_e:

     somers2(x, y, weights=NULL, normwt=FALSE, na.rm=TRUE)

_A_r_g_u_m_e_n_t_s:

       x: typically a predictor variable. 'NA's are allowed. 

       y: a numeric outcome variable coded '0-1'. 'NA's are allowed. 

 weights: a numeric vector of observation weights (usually
          frequencies).  Omit or specify a zero-length vector to do an
          unweighted analysis. 

  normwt: set to 'TRUE' to make 'weights' sum to the actual number of
          non-missing observations. 

   na.rm: set to 'FALSE' to suppress checking for NAs. 

_D_e_t_a_i_l_s:

     The 'rcorr.cens' function, although slower than 'somers2' for
     large sample sizes, can also be used to obtain Dxy for
     non-censored binary 'y'. It has the advantage of also computing
     the standard deviation of the correlation index.
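
     As a rough illustration of the comparison above, the sketch below
     calls both functions on the same simulated data. It assumes Hmisc
     is installed, and that 'rcorr.cens' returns a named vector with
     the elements 'Dxy' and 'S.D.' as described on its own help page.
     The object names 'fast' and 'slow' are only illustrative.

```r
# Sketch only: runs when Hmisc is available, otherwise does nothing.
if (requireNamespace("Hmisc", quietly = TRUE)) {
  set.seed(1)
  x <- runif(200)                 # simulated predictor
  y <- sample(0:1, 200, TRUE)     # simulated binary outcome

  fast <- Hmisc::somers2(x, y)    # C, Dxy, n, Missing
  slow <- Hmisc::rcorr.cens(x, y) # a plain vector y is treated as uncensored

  # Dxy from both functions, plus the standard deviation that only
  # rcorr.cens reports (element names assumed from its help page)
  print(c(somers2 = unname(fast["Dxy"]),
          rcorr.cens = unname(slow["Dxy"]),
          SD = unname(slow["S.D."])))
}
```

     The two Dxy values should agree for a binary 'y'; the 'S.D.'
     element can then be used to judge the precision of the estimate.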

_V_a_l_u_e:

     A vector with the named elements 'C', 'Dxy', 'n' (number of
     non-missing pairs), and 'Missing'. 'C' is computed with the
     formula 'C = (mean(rank(x)[y == 1]) - (n1 + 1)/2)/(n - n1)', where
     'n1' is the frequency of 'y=1'.
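
     The rank formula can be checked in base R without Hmisc. The
     sketch below is an unweighted illustration only, not the package
     source: 'c_by_rank' and 'c_by_pairs' are hypothetical helper
     names. The first applies the formula above; the second counts
     concordant pairs directly, scoring ties in 'x' as one half.

```r
# Rank-based C, following the formula in the Value section
c_by_rank <- function(x, y) {
  n  <- length(x)
  n1 <- sum(y == 1)
  (mean(rank(x)[y == 1]) - (n1 + 1) / 2) / (n - n1)
}

# Brute-force C: proportion of (y=1, y=0) pairs in which the y=1
# observation has the larger x, with ties counted as 1/2
c_by_pairs <- function(x, y) {
  d <- outer(x[y == 1], x[y == 0], "-")
  mean((d > 0) + 0.5 * (d == 0))
}

set.seed(1)
x <- round(runif(50), 1)   # rounding deliberately creates ties
y <- rep(0:1, 25)

c_rank  <- c_by_rank(x, y)
c_pairs <- c_by_pairs(x, y)
dxy     <- 2 * (c_rank - 0.5)    # Dxy = 2(c - 0.5)

print(all.equal(c_rank, c_pairs))
```

     The two versions agree because 'rank' assigns midranks to ties,
     so the mean rank of the 'y=1' group, less '(n1 + 1)/2', equals
     the pair count divided by 'n1'.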

_A_u_t_h_o_r(_s):

     Frank Harrell 
      Department of Biostatistics 
      Vanderbilt University School of Medicine 
      f.harrell@vanderbilt.edu

_S_e_e _A_l_s_o:

     'rcorr.cens', 'rank', 'wtd.rank'

_E_x_a_m_p_l_e_s:

     library(Hmisc)
     set.seed(1)
     predicted <- runif(200)                      # simulated predictions
     dead      <- sample(0:1, 200, TRUE)          # simulated 0-1 outcome
     roc.area  <- somers2(predicted, dead)["C"]   # extract the ROC area

