LD                 package:genetics                 R Documentation

_P_a_i_r_w_i_s_e _l_i_n_k_a_g_e _d_i_s_e_q_u_i_l_i_b_r_i_u_m _b_e_t_w_e_e_n _g_e_n_e_t_i_c _m_a_r_k_e_r_s.

_D_e_s_c_r_i_p_t_i_o_n:

     Compute pairwise linkage disequilibrium between genetic markers.

_U_s_a_g_e:

     LD(g1, ...)
     ## S3 method for class 'genotype':
     LD(g1,g2,...)
     ## S3 method for class 'data.frame':
     LD(g1,...)

_A_r_g_u_m_e_n_t_s:

      g1: genotype object or data frame containing genotype objects 

      g2: genotype object (ignored if g1 is a data frame) 

     ...: optional arguments (ignored) 

_D_e_t_a_i_l_s:

     Linkage disequilibrium (LD) is the non-random association of
     marker alleles and can arise from marker proximity or from
     selection bias.

     'LD.genotype' estimates the extent of LD for a single pair of
     genotypes.  'LD.data.frame' computes LD for all pairs of genotypes
     contained in a data frame.  Before starting, 'LD.data.frame'
     checks the class and number of alleles of each variable in the
     data frame.  If the data frame contains non-genotype objects or
     genotypes with more or fewer than 2 alleles, these are omitted
     from the computation and a warning is generated.

     Three estimators of LD are computed:


   _D raw difference in frequency between the observed number of AB
        pairs and the expected number:

                        D = p(AB) - p(A)*p(B)


   _D' scaled D spanning the range [-1,1] 

                            D' = D / Dmax

        where, if D > 0:

                   Dmax = min( p(A)p(b), p(a)p(B) )

        or if D < 0:

                  Dmax = max( -p(A)p(B), -p(a)p(b) )


   _r correlation coefficient between the markers

              r = -D / sqrt( p(A) * p(a) * p(B) * p(b) )


     where

   - p(A) is defined as the observed probability of allele 'A' for
        marker 1, 

   - p(a) = 1-p(A) is defined as the observed probability of allele 'a'
        for marker 1, 

   - p(B) is defined as the observed probability of allele 'B' for
        marker 2, 

   - p(b) = 1-p(B) is defined as the observed probability of allele
        'b' for marker 2, and 

   - p(AB) is defined as the probability of the marker allele pair
        'AB'. 
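
     As a rough illustration of the three estimators, the sketch below
     computes D, D' and r from allele and haplotype frequencies that
     are assumed to be known.  The helper name 'ld_from_freqs' and the
     input frequencies are hypothetical and not part of the package.
     The sketch takes r as D divided by the square-root term, the usual
     sign convention for a correlation; that choice is an assumption of
     this example.

```r
# Hypothetical helper (not part of the genetics package): compute D, D' and r
# from known frequencies, using the definitions given in this section.
ld_from_freqs <- function(pAB, pA, pB) {
  pa <- 1 - pA                     # p(a) = 1 - p(A)
  pb <- 1 - pB                     # p(b) = 1 - p(B)
  D  <- pAB - pA * pB              # raw disequilibrium

  # Scaling constant: depends on the sign of D
  Dmax <- if (D > 0) min(pA * pb, pa * pB) else max(-pA * pB, -pa * pb)

  list(D      = D,
       Dprime = D / Dmax,
       r      = D / sqrt(pA * pa * pB * pb))  # assumed sign convention
}

# Made-up frequencies: p(AB) = 0.5, p(A) = 0.6, p(B) = 0.7
ex <- ld_from_freqs(pAB = 0.5, pA = 0.6, pB = 0.7)
print(ex)
```

     With these inputs D is 0.5 - 0.6*0.7 = 0.08 and Dmax is
     min(0.18, 0.28) = 0.18, so D' is 0.08 / 0.18.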

     For genotype data, AB/ab cannot be distinguished from aB/Ab.
     Consequently, we estimate p(AB) using maximum likelihood and use
     this value in the computations.
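
     The maximum-likelihood step can be sketched as follows.  This is
     one common way to do it and is not necessarily how the package
     implements it: the allele frequencies are fixed at their observed
     values, random mating is assumed, and p(AB) is found by a
     one-dimensional search with 'optimize'.  The function name
     'estimate_pAB', the 3 x 3 table layout and the counts are all
     hypothetical.

```r
# Sketch only (assumptions: random mating, allele frequencies fixed at their
# observed values).  'counts' is a 3 x 3 table of two-marker genotype counts
# with rows AA, Aa, aa and columns BB, Bb, bb.
estimate_pAB <- function(counts) {
  n  <- sum(counts)
  pA <- (2 * sum(counts[1, ]) + sum(counts[2, ])) / (2 * n)
  pB <- (2 * sum(counts[, 1]) + sum(counts[, 2])) / (2 * n)

  loglik <- function(pAB) {
    pAb <- pA - pAB
    paB <- pB - pAB
    pab <- 1 - pA - pB + pAB
    probs <- matrix(c(
      pAB^2,         2 * pAB * pAb,                 pAb^2,
      2 * pAB * paB, 2 * pAB * pab + 2 * pAb * paB, 2 * pAb * pab,
      paB^2,         2 * paB * pab,                 pab^2
    ), nrow = 3, byrow = TRUE)
    # The centre cell sums both phases (AB/ab and Ab/aB) because genotype
    # data cannot tell them apart.  Empty cells are skipped.
    keep <- counts > 0
    sum(counts[keep] * log(probs[keep]))
  }

  # p(AB) must keep all four haplotype frequencies non-negative
  bounds <- c(max(0, pA + pB - 1), min(pA, pB))
  opt <- optimize(loglik, interval = bounds, maximum = TRUE)
  list(pA = pA, pB = pB, pAB = opt$maximum)
}

# Made-up counts, chosen to match haplotype frequencies
# AB = 0.5, Ab = 0.1, aB = 0.2, ab = 0.2 exactly
counts <- matrix(c(25, 10, 1,
                   20, 24, 4,
                    4,  8, 4), nrow = 3, byrow = TRUE,
                 dimnames = list(c("AA", "Aa", "aa"), c("BB", "Bb", "bb")))
fit <- estimate_pAB(counts)
print(fit)
```

     The search interval keeps every haplotype frequency non-negative,
     so the likelihood is only evaluated at feasible values of p(AB).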

_V_a_l_u_e:

     'LD.genotype' returns a 7-element list: 

    call: the matched call

       D: Linkage disequilibrium estimate

  Dprime: Scaled linkage disequilibrium estimate

    corr: Correlation coefficient

    nobs: Number of observations

   chisq: Chi-square statistic for linkage equilibrium (i.e.,
          D=D'=corr=0)

 p.value: Chi-square p-value for marker independence


     'LD.data.frame' returns a list with the same elements, but each
     element is a matrix where the upper off-diagonal elements contain
     the estimate for the corresponding pair of markers.  The other
     matrix elements are 'NA'.
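
     The matrix layout described above can be pictured with a small
     base R sketch.  It only illustrates the upper-triangle convention;
     'pairwise_upper', the toy data and the use of 'cor' as the
     pairwise statistic are hypothetical stand-ins, not the package's
     code.

```r
# Illustration of the layout only: fill the upper off-diagonal cells of a
# k x k matrix with a pairwise statistic and leave everything else NA.
pairwise_upper <- function(vars, stat) {
  k <- length(vars)
  out <- matrix(NA_real_, nrow = k, ncol = k,
                dimnames = list(names(vars), names(vars)))
  if (k < 2) return(out)           # nothing to pair up
  for (i in seq_len(k - 1)) {
    for (j in (i + 1):k) {
      out[i, j] <- stat(vars[[i]], vars[[j]])
    }
  }
  out
}

# Toy numeric vectors standing in for markers
toy <- list(g1 = c(1, 2, 3, 4), g2 = c(2, 4, 6, 9), g3 = c(4, 3, 2, 1))
m <- pairwise_upper(toy, cor)
print(m)
m["g1", "g3"]   # filled: row marker comes before column marker
m["g3", "g1"]   # NA: lower triangle is left empty
```

     To read the value for a pair, index with the earlier marker as the
     row and the later marker as the column.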

_A_u_t_h_o_r(_s):

     Gregory R. Warnes gregory_r_warnes@groton.pfizer.com

_S_e_e _A_l_s_o:

     'genotype', 'HWE.test'

_E_x_a_m_p_l_e_s:

     library(genetics)

     g1 <- genotype( c('T/A',    NA, 'T/T',    NA, 'T/A',    NA, 'T/T', 'T/A',
                       'T/T', 'T/T', 'T/A', 'A/A', 'T/T', 'T/A', 'T/A', 'T/T',
                          NA, 'T/A', 'T/A',   NA) )

     g2 <- genotype( c('C/A', 'C/A', 'C/C', 'C/A', 'C/C', 'C/A', 'C/A', 'C/A',
                       'C/A', 'C/C', 'C/A', 'A/A', 'C/A', 'A/A', 'C/A', 'C/C',
                       'C/A', 'C/A', 'C/A', 'A/A') )

     g3 <- genotype( c('T/A', 'T/A', 'T/T', 'T/A', 'T/T', 'T/A', 'T/A', 'T/A',
                       'T/A', 'T/T', 'T/A', 'T/T', 'T/A', 'T/A', 'T/A', 'T/T',
                       'T/A', 'T/A', 'T/A', 'T/T') )

     # Compute LD on a single pair

     LD(g1,g2)

     # Compute LD table for all 3 genotypes

     data <- makeGenotypes(data.frame(g1,g2,g3))
     LD(data)

