coding               package:meanscore               R Documentation

_c_o_m_b_i_n_e_s _t_w_o _o_r _m_o_r_e _s_u_r_r_o_g_a_t_e/_a_u_x_i_l_i_a_r_y _v_a_r_i_a_b_l_e_s _i_n_t_o _a _v_e_c_t_o_r

_D_e_s_c_r_i_p_t_i_o_n:

     Recodes a matrix of categorical variables into a vector which
     takes a unique value for each combination of their values.

     *BACKGROUND*

     From the matrix Z of first-stage covariates, this function creates
     a vector which takes a unique value for each combination, as
     follows:

       z1  z2  z3  new.z
        0   0   0      1
        1   0   0      2
        0   1   0      3
        1   1   0      4
        0   0   1      5
        1   0   1      6
        0   1   1      7
        1   1   1      8

     If some of the combinations do not exist, the function adjusts
     the codes accordingly: for example, if the combination (0,1,1) is
     absent above, then (1,1,1) will be coded as 7.
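
     The coding scheme in the table above can be reproduced with a few
     lines of base R. This is only a sketch for 0/1 variables, not the
     package's own code: the helper name 'recode_combinations' is
     hypothetical, and the renumbering step is an assumption based on the
     (0,1,1) example.

```r
# Hypothetical helper (not part of meanscore): code each row of a 0/1 matrix
# so that the first column varies fastest, then renumber to consecutive codes.
recode_combinations <- function(z) {
  z <- as.matrix(z)
  weights <- 2^(seq_len(ncol(z)) - 1)   # 1, 2, 4, ... for z1, z2, z3, ...
  raw <- as.vector(z %*% weights) + 1   # 1..2^k when every combination exists
  match(raw, sort(unique(raw)))         # close gaps left by absent combinations
}

# All eight combinations, in the same row order as the table above
z_full <- as.matrix(expand.grid(z1 = 0:1, z2 = 0:1, z3 = 0:1))
new_z_full <- recode_combinations(z_full)

# Remove the row (0,1,1); the last row, (1,1,1), is then coded 7 instead of 8
z_part <- z_full[-7, ]
new_z_part <- recode_combinations(z_part)
```

     Variables with more than two levels would need a different weighting,
     so the sketch should not be read as a general replacement for
     'coding'.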

     The values of this vector are reported as 'new.z' in the printed
     output (see 'Value' below).

     This function should be run on the second-stage data before
     calling 'ms.nprev', because its output shows the order in which
     'ms.nprev' expects the first-stage sample sizes to be provided.

_U_s_a_g_e:

     coding(x=x,y=y,z=z,return=FALSE)

_A_r_g_u_m_e_n_t_s:

     REQUIRED ARGUMENTS

       y: response variable (should be binary 0-1)

       x: matrix of predictor variables for regression model

       z: matrix of any surrogate or auxiliary variables 

     OPTIONAL ARGUMENTS

  return: logical value; if TRUE (T), the original surrogate or
          auxiliary variables and the recoded auxiliary variables
          are returned. The default is FALSE (F).

_V_a_l_u_e:

     This function does not return a value unless 'return' is TRUE.

     If used with only second-stage (i.e. complete) data, it prints the
     following:

  ylevel: the distinct values (or levels) of y

*z*1 ... *z*i: the distinct values of the first-stage variables *z*1 ...
          *z*i

   new.z: recoded first-stage variables. Each value represents a unique
          combination of first-stage variable values.

      n2: second-stage sample sizes in each ('ylevel','new.z') stratum.

     If used with combined first- and second-stage data (i.e. with NA for
     missing values), the function also prints the following in addition
     to the items above:

      n1: first-stage sample sizes in each ('ylevel','new.z') stratum.
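
     The stratum counts described above can be sketched in base R as
     follows. The toy vectors are made up for illustration, and the use
     of 'table' is an assumption about one way to obtain such counts, not
     a description of how 'coding' computes them.

```r
# Toy combined data (made-up values): y and new.z are known for everyone,
# x is NA for subjects who were not sampled at the second stage.
y     <- c(0, 0, 0, 0, 1, 1, 1, 1, 1, 0)
new.z <- c(1, 1, 2, 2, 1, 1, 2, 2, 2, 1)
x     <- c(0.5, NA, 1.2, NA, 0.3, 0.9, NA, 1.1, NA, NA)

in_stage2 <- !is.na(x)   # TRUE for second-stage (complete) subjects

# Rows are new.z levels, columns are y levels
n1_tab <- table(new.z = new.z, ylevel = y)
n2_tab <- table(new.z = new.z[in_stage2], ylevel = y[in_stage2])

# as.vector() reads column by column: every new.z within ylevel 0,
# then every new.z within ylevel 1
n1 <- as.vector(n1_tab)
n2 <- as.vector(n2_tab)
```

     With a matrix of predictors, 'complete.cases(x)' could take the
     place of '!is.na(x)'. If a stratum has no second-stage subjects,
     'table' drops it from 'n2_tab' unless the variables are converted
     to factors with all levels first.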

_S_e_e _A_l_s_o:

     'meanscore', 'ms.nprev', 'ectopic', 'simNA', 'glm'.

_E_x_a_m_p_l_e_s:

     ## The ectopic data set has 3 categorical first-stage variables in
     ## columns 3 to 5, which together with column 2 are the predictor
     ## variables of the dichotomous outcome in column 1 (see
     ## help(ectopic) for further details). Typing
     data(ectopic)
     coding(x=ectopic[,2:5],y=ectopic[,1], z=ectopic[,3:5])

     ## gives the following coding scheme and first-stage and second-stage
     ## sample sizes (n1 and n2 respectively):

     ## Not run: 
      ylevel gonnorhoea contracept sexpatr new.z  n1 n2
           0          0          0       0     1  56 13
           0          0          1       0     2 146 36
           0          0          0       1     3 119 33
           0          1          0       1     4  19  8
           0          0          1       1     5 344 93
           0          1          1       1     6  31  9
           1          0          0       0     1  26 11
           1          0          1       0     2   9  5
           1          0          0       1     3 160 79
           1          1          0       1     4  29 18
           1          0          1       1     5  35 20
           1          1          1       1     6   5  2
     ## End(Not run)
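
     As a sketch of how the printed table might be used afterwards, the
     counts above can be transcribed into a data frame and read off in
     ('ylevel','new.z') order. The data frame and variable names are
     hypothetical. The call to 'ms.nprev' is deliberately left out; see
     help(ms.nprev) for its arguments.

```r
# Values transcribed from the printed output above
strata <- data.frame(
  ylevel = rep(0:1, each = 6),
  new.z  = rep(1:6, times = 2),
  n1     = c(56, 146, 119, 19, 344, 31, 26, 9, 160, 29, 35, 5),
  n2     = c(13, 36, 33, 8, 93, 9, 11, 5, 79, 18, 20, 2)
)

# Sort by ylevel, then new.z, to mirror the row order of the printout
strata <- strata[order(strata$ylevel, strata$new.z), ]
n1 <- strata$n1   # first-stage sample sizes in printed order

# Totals by outcome level at each stage
n1_by_y <- tapply(strata$n1, strata$ylevel, sum)
n2_by_y <- tapply(strata$n2, strata$ylevel, sum)
```

     Only six 'new.z' codes appear because the combinations (1,0,0) and
     (1,1,0) do not occur in the table, so the remaining combinations
     are numbered consecutively.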

