genecounting               package:gap               R Documentation

_G_e_n_e _c_o_u_n_t_i_n_g _f_o_r _h_a_p_l_o_t_y_p_e _a_n_a_l_y_s_i_s

_D_e_s_c_r_i_p_t_i_o_n:

     Gene counting for haplotype analysis with missing data

_U_s_a_g_e:

     genecounting(data,weight=NULL,loci=NULL,control=gc.control())

_A_r_g_u_m_e_n_t_s:

    data: genotype table

  weight: a column of frequency weights

    loci: an array containing the number of alleles at each locus

 control: control parameters as returned by gc.control(), a function
          with the following arguments:

        _x_d_a_t_a a flag indicating if the data involves X chromosome, if
             so, the first column of data indicates sex of each
             subject: 1=male, 2=female. The marker  data are no
             different from the autosomal version for females, but for
             males, two copies of the single allele present at a given
             locus.

        _c_o_n_v_l_l set convergence criteria according to log-likelihood, if
             its value set to 1

        _h_a_n_d_l_e._m_i_s_s to handle missing data, if its value set to 1

        _e_p_s the actual convergence criteria, with default value 1e-5

        _t_o_l tolerance for genotype probabilities with default value
             1e-8

        _m_a_x_i_t maximum number of iterations, with default value 50

        _p_l criteria for trimming haplotypes according to posterior
             probabilities

        _v_e_r_b_o_s_e If T, yield print out from the C routine

_V_a_l_u_e:

     The returned value is a list containing:

       h: haplotype frequency estimates under linkage disequilibrium
          (LD)

      h0: haplotype frequency estimates under linkage equilibrium (no
          LD)

    prob: genotype probability estimates

      l0: log-likelihood under linkage equilibrium

      l1: log-likelihood under linkage disequilibrium

   hapid: unique haplotype identifier (defunct, see gc.em)

   npusr: number of parameters according to user-given alleles

   npdat: number of parameters according to observed alleles

htrtable: design matrix for haplotype trend regression (defunct, see
          gc.em)

    iter: number of iterations used in gene counting

converge: a flag indicating convergence status of gene counting

     di0: haplotype diversity under no LD, defined as 1 - sum(h0^2)

     di1: haplotype diversity under LD, defined as 1 - sum(h^2)

   resid: residuals in terms of frequency weights, i.e. observed minus
          expected (o - e)
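
     The definitions of h0, di0 and di1 above can be checked by hand
     on a small case. The sketch below uses base R only and made-up
     frequencies for two biallelic loci; none of the numbers come from
     genecounting() output, and the log-likelihood values are likewise
     hypothetical. Comparing l1 with l0 through 2*(l1 - l0) is shown
     as a common way to use the two values, not as something the
     function computes itself.

```r
# Hypothetical allele frequencies at two biallelic loci.
p1 <- c(A = 0.6, a = 0.4)
p2 <- c(B = 0.7, b = 0.3)

# Under linkage equilibrium, haplotype frequencies are products of
# allele frequencies (the role played by h0).
h0 <- as.vector(outer(p1, p2))
names(h0) <- c("AB", "aB", "Ab", "ab")

# Hypothetical haplotype frequencies under LD (the role played by h),
# chosen to have the same allele frequencies as above.
h <- c(AB = 0.50, aB = 0.20, Ab = 0.10, ab = 0.20)

# Haplotype diversity as defined for di0 and di1.
di0 <- 1 - sum(h0^2)
di1 <- 1 - sum(h^2)

# Hypothetical log-likelihoods and the likelihood ratio statistic.
l0 <- -120.5
l1 <- -115.2
lrt <- 2 * (l1 - l0)

c(di0 = di0, di1 = di1, lrt = lrt)
```

     The degrees of freedom needed to turn the statistic into a
     p-value are not derived here.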

_R_e_f_e_r_e_n_c_e_s:

     Zhao, J. H., Lissarrague, S., Essioux, L. and Sham, P. C. (2002).
     GENECOUNTING: haplotype analysis with missing genotypes.
     Bioinformatics 18(12):1694-1695.

     Zhao, J. H. and Sham, P. C. (2003). Generic number systems and
     haplotype analysis. Comp Meth Prog Biomed 70:1-9.

     Zhao, J. H. (2004). 2LD, GENECOUNTING and HAP: Computer programs
     for linkage disequilibrium analysis. Bioinformatics 20:1325-1326.

_N_o_t_e:

     Adapted from the GENECOUNTING program.

_A_u_t_h_o_r(_s):

     Jing Hua Zhao

_S_e_e _A_l_s_o:

     'gc.em', 'kbyl'

_E_x_a_m_p_l_e_s:

     ## Not run: 
     # HLA data
     data(hla)
     hla.gc <- genecounting(hla[,3:8])
     summary(hla.gc)
     hla.gc$l0
     hla.gc$l1

     # ALDH2 data
     data(aldh2)
     control <- gc.control(handle.miss=1)
     aldh2.gc <- genecounting(aldh2[,3:6],control=control)
     summary(aldh2.gc)
     aldh2.gc$l0
     aldh2.gc$l1

     # Chromosome X data
     # assuming the mao data set is available from the package and that
     # columns 3-13 hold sex (column 3) followed by the allelic data
     data(mao)
     dat <- mao[,3:13]
     loci <- c(12,9,6,5,3)
     contr <- gc.control(xdata=TRUE,handle.miss=1)
     mao.gc <- genecounting(dat,loci=loci,control=contr)
     mao.gc$npusr
     mao.gc$npdat
     ## End(Not run)
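
     The idea of gene counting can also be illustrated without the
     package. The sketch below is a toy EM iteration in base R for two
     biallelic loci with complete genotypes; it is not the C routine
     behind genecounting() and does not handle missing data or more
     than two loci. The function toy_gene_count and the layout of its
     input table g are assumptions made for this illustration; the
     names h, iter and converge in the result merely mirror the Value
     section.

```r
# g: hypothetical 3 x 3 table of genotype counts, where
#    row i    = i - 1 copies of allele A at locus 1,
#    column j = j - 1 copies of allele B at locus 2.
toy_gene_count <- function(g, eps = 1e-8, maxit = 200) {
  stopifnot(is.matrix(g), all(dim(g) == c(3, 3)))
  n <- sum(g)
  h <- c(AB = 0.25, Ab = 0.25, aB = 0.25, ab = 0.25)  # starting values
  converged <- FALSE
  for (iter in seq_len(maxit)) {
    # E-step: only double heterozygotes are phase-ambiguous; split them
    # between AB/ab and Ab/aB according to current haplotype frequencies.
    cis   <- h[["AB"]] * h[["ab"]]
    trans <- h[["Ab"]] * h[["aB"]]
    w  <- if (cis + trans > 0) cis / (cis + trans) else 0.5
    dh <- g[2, 2]
    # M-step: count haplotypes, using expected counts for double hets.
    counts <- c(
      AB = 2 * g[3, 3] + g[3, 2] + g[2, 3] + w * dh,
      Ab = 2 * g[3, 1] + g[3, 2] + g[2, 1] + (1 - w) * dh,
      aB = 2 * g[1, 3] + g[2, 3] + g[1, 2] + (1 - w) * dh,
      ab = 2 * g[1, 1] + g[2, 1] + g[1, 2] + w * dh
    )
    h_new <- counts / (2 * n)
    converged <- max(abs(h_new - h)) < eps
    h <- h_new
    if (converged) break
  }
  list(h = h, iter = iter, converge = converged)
}

# Made-up counts: 10 AB/AB, 10 ab/ab and 20 double heterozygotes.
g <- matrix(0, nrow = 3, ncol = 3)
g[3, 3] <- 10
g[1, 1] <- 10
g[2, 2] <- 20
fit <- toy_gene_count(g)
fit$h
```

     With these counts the unambiguous genotypes carry only AB and ab
     haplotypes, so the iteration assigns the double heterozygotes
     almost entirely to the AB/ab phase.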

