makekinship             package:kinship             R Documentation

_C_r_e_a_t_e _a _s_p_a_r_s_e _k_i_n_s_h_i_p _m_a_t_r_i_x

_D_e_s_c_r_i_p_t_i_o_n:

     Compute the overall kinship matrix for a collection of families,
     and store it efficiently.

_U_s_a_g_e:

     makekinship(famid, id, father.id, mother.id, unrelated=0)

_A_r_g_u_m_e_n_t_s:

   famid: a vector of family identifiers 

      id: a vector of unique subject identifiers 

father.id : for each subject, the identifier of their biological father 

mother.id : for each subject, the identifier of their biological mother 

unrelated: subjects with this family id are considered to be unrelated
          singletons, i.e., not related to each other or to anyone
          else. 

_D_e_t_a_i_l_s:

     For each family of more than one member, the  'kinship' function
     is called to calculate a per-family kinship matrix. These are
     stored efficiently in a single block-diagonal sparse
     matrix object, taking advantage of the fact that between-family
     entries in the full matrix are all 0. Unrelated individuals are
     considered to be families of size 1, and are placed first in the
     matrix.
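
     The layout described above can be sketched in base R. This is
     an illustration only, not the package's implementation: the helper
     'trio', the ids, and the dense matrix 'full' are hypothetical, and
     the per-family blocks are hard-coded rather than computed by
     'kinship'. A self-kinship of 0.5 for each unrelated singleton is
     assumed.

```r
# Per-family kinship for a father, mother, child trio (standard values:
# 0.5 on the diagonal, 0.25 parent-child, 0 between unrelated parents).
# 'trio' is a hypothetical stand-in for a per-family kinship() call.
trio <- function(ids) {
  matrix(c(0.50, 0.00, 0.25,
           0.00, 0.50, 0.25,
           0.25, 0.25, 0.50), 3, 3, dimnames = list(ids, ids))
}
blocks <- list(trio(c("a1", "a2", "a3")), trio(c("b1", "b2", "b3")))
singletons <- c("u1", "u2")   # unrelated subjects, placed first

# Dense version of the block-diagonal matrix, for inspection only
ids <- c(singletons, unlist(lapply(blocks, rownames)))
full <- matrix(0, length(ids), length(ids), dimnames = list(ids, ids))
diag(full)[seq_along(singletons)] <- 0.5   # assumed singleton self-kinship
for (b in blocks) full[rownames(b), rownames(b)] <- b

# Entries inside the blocks (plus singleton diagonals) versus the dense total
stored <- length(singletons) + sum(sapply(blocks, length))
frac <- stored / length(full)
```

     Every entry of 'full' outside the two 3 x 3 blocks and the
     singleton diagonal is 0, which is why only the blocks need to be
     kept. A real sparse class may store even less than 'stored'
     suggests, for example by exploiting symmetry.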

     The final order of the rows within this matrix will not
     necessarily be the same as in the original data, since each family
     must be contiguous. The dimnames of the matrix contain the id
     variable for each row/column. Also note that to create the kinship
     matrix for a subset of the data it is necessary to create the full
     kinship matrix first and then subset it. One cannot first subset
     the data and then call the function. For instance, a call using
     only the female data would not detect that a particular man's
     sister and his daughter are related.
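
     The sister and daughter case can be checked on a toy pedigree. The
     sketch below uses a hypothetical base R helper, 'kin_sketch', that
     applies the usual kinship recursion; it is not the package's
     'kinship' function, and it assumes that founders have parent id 0
     and that parents are listed before their children.

```r
# Hypothetical helper: kinship by the standard recursion.
# Assumes parent id 0 (or any id not in 'id') marks a missing parent,
# and that every parent appears earlier in 'id' than its children.
kin_sketch <- function(id, father, mother) {
  n <- length(id)
  K <- matrix(0, n, n, dimnames = list(as.character(id), as.character(id)))
  for (i in seq_len(n)) {
    f <- match(father[i], id)   # NA when the father is not in the data
    m <- match(mother[i], id)   # NA when the mother is not in the data
    for (j in seq_len(i - 1)) {
      kf <- if (is.na(f)) 0 else K[f, j]
      km <- if (is.na(m)) 0 else K[m, j]
      K[i, j] <- K[j, i] <- (kf + km) / 2
    }
    fm <- if (is.na(f) || is.na(m)) 0 else K[f, m]
    K[i, i] <- (1 + fm) / 2
  }
  K
}

# 3 = the man, 4 = his sister, 5 = his wife, 6 = his daughter
ped <- data.frame(
  id     = c(1, 2, 3, 4, 5, 6),
  father = c(0, 0, 1, 1, 0, 3),
  mother = c(0, 0, 2, 2, 0, 5),
  sex    = c("M", "F", "M", "F", "F", "F")
)
fem <- ped[ped$sex == "F", ]
femid <- as.character(fem$id)

# Full matrix first, then subset: the aunt-niece kinship is kept
right <- kin_sketch(ped$id, ped$father, ped$mother)[femid, femid]
# Subset first, then compute: the link through the man is lost
wrong <- kin_sketch(fem$id, fem$father, fem$mother)

right["4", "6"]   # 0.125
wrong["4", "6"]   # 0
```

     In the female-only data the daughter's father is absent, so the
     only path connecting her to her aunt disappears and the two appear
     unrelated.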

_V_a_l_u_e:

     a sparse kinship matrix of class 'bdsmatrix'

_S_e_e _A_l_s_o:

     kinship, makefamid

_E_x_a_m_p_l_e_s:

     ## Not run: 
     # Data set from a large family study of breast cancer
     #  there are 26050 subjects in the file, from 426 families
     table(cdata$sex)
     #     F     M 
     # 12699 13351
     length(unique(cdata$famid))
     # [1] 426

     kin1 <- makekinship(cdata$famid, cdata$gid, cdata$dadid, cdata$momid)
     dim(kin1)
     # [1] 26050 26050
     class(kin1)
     # [1] "bdsmatrix"
     # The next line shows that few of the elements of the full matrix are >0
     length(kin1@blocks) / prod(dim(kin1))
     # [1] 0.00164925

     # kinship matrix for the females only
     femid <- cdata$gid[cdata$sex == 'F']
     femindex <- !is.na(match(dimnames(kin1)[[1]], femid))
     kin2 <- kin1[femindex, femindex]
     #
     # Note that "femindex <- match(femid, dimnames(kin1)[[1]])" is wrong, since
     #  then kin1[femindex, femindex] might improperly reorder the rows/cols 
     #  (if families were not contiguous in cdata).  
     # However sort(match(femid, dimnames(kin1)[[1]])) would be okay.
     ## End(Not run)
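
     The indexing note at the end of the example can be tried on a small
     ordinary matrix. This is a sketch with made-up ids, not output from
     the study data; 'kin' merely stands in for a kinship matrix whose
     row order differs from the order of the ids in the data.

```r
# Toy stand-in: rows are in "family-contiguous" order, which differs
# from the order in which the ids appear in the data.
ids_in_matrix <- c("c", "a", "d", "b")
kin <- matrix(1:16, 4, 4, dimnames = list(ids_in_matrix, ids_in_matrix))
femid <- c("a", "b", "c")                     # order as found in the data

keep <- !is.na(match(rownames(kin), femid))   # logical index, matrix order
bad  <- match(femid, rownames(kin))           # 2 4 1: permutes the rows
ok   <- sort(bad)                             # 1 2 4: matrix order again

rownames(kin[keep, keep])   # "c" "a" "b"
rownames(kin[bad, bad])     # "a" "b" "c"
```

     The logical index and the sorted integer index select the same
     rows in the same order, while the unsorted integer index permutes
     them and so no longer follows the family-contiguous ordering of
     the matrix.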

