diseq                package:genetics                R Documentation

_E_s_t_i_m_a_t_e _o_r _C_o_m_p_u_t_e _C_o_n_f_i_d_e_n_c_e _I_n_t_e_r_v_a_l _f_o_r _t_h_e _S_i_n_g_l_e-_M_a_r_k_e_r _D_i_s_e_q_u_i_l_i_b_r_i_u_m

_D_e_s_c_r_i_p_t_i_o_n:

     Estimate or compute confidence interval for single-marker
     disequilibrium.

_U_s_a_g_e:

     diseq(x, ...)
     ## S3 method for class 'diseq':
     print(x, show=c("D","D'","r","R^2","table"), ...)
     diseq.ci(x, R=1000, conf=0.95, correct=TRUE, na.rm=TRUE, ...)

_A_r_g_u_m_e_n_t_s:

       x: genotype or haplotype object.

    show: a character value or vector indicating which disequilibrium
          measures should be displayed.  The default is to show all of
          the available measures. 'show="table"' will display a table
          of observed, expected, and observed-expected frequencies.

    conf: Confidence level to use when computing the confidence interval
          for D-hat.  Defaults to 0.95; should be in (0,1).

       R: Number of bootstrap iterations to use when computing the
          confidence interval. Defaults to 1000.

 correct: See details.

   na.rm: logical. Should missing values be removed?

     ...: optional parameters passed to 'boot.ci' ('diseq.ci') or
          ignored.

_D_e_t_a_i_l_s:

     For a single-gene marker, 'diseq' computes the Hardy-Weinberg
     (dis)equilibrium statistics D, D', r (the correlation coefficient),
     and r^2 for each pair of allele values, as well as an overall
     summary value for each measure across all alleles.  'print.diseq'
     displays the contents of a 'diseq' object. 'diseq.ci' computes a
     bootstrap confidence interval for this estimate.

     For consistency, I have applied the standard definitions for D,
     D', and r from the Linkage Disequilibrium case, replacing all
     marker  probabilities with the appropriate allele probabilities.

     Thus, for each allele pair,

   _D is defined as the half of the raw difference in frequency between
        the observed number of heterozygotes and the expected number:

               D = 1/2 * ( p(ij) + p(ji) ) - p(i)*p(j)


   _D' rescales D to span the range [-1,1] 

                            D' = D / Dmax

        where, if D > 0:

              Dmax = min(p(i)p(j), p(j)p(i)) =  p(i)p(j)

        or if D < 0:

        Dmax = min( p(i) * (1 - p(j)), p(j) * (1 - p(i)) )


   _r is the correlation coefficient between two alleles, and can be
        computed by

            r = -D / sqrt( p(i)*(1-p(i)) * p(j)*(1-p(j)) )


     where

   - p(i) defined as the observed probability of allele 'i', 

   - p(j) defined as the observed probability of allele 'j', and 

   - p(ij) defined as the observed probability of the allele pair 'ij'. 
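
     As an illustration, the definitions of D and r above can be
     evaluated by hand in base R.  This is only a sketch that follows
     the formulas literally (including the sign in r); it is not the
     package's implementation.  The variable names ('geno', 'p.pair',
     'p', 'expected') are chosen here for illustration, and the data
     are the same genotypes as 'three.data' in the Examples.

```r
# Genotypes as "allele1/allele2" strings (same data as 'three.data')
geno <- c(rep("A/A", 8), rep("C/A", 20), rep("C/T", 20),
          rep("C/C", 10), rep("T/T", 3))

# Split each genotype into its two alleles (one row per genotype)
parts   <- do.call(rbind, strsplit(geno, "/", fixed = TRUE))
alleles <- sort(unique(c(parts)))

# Ordered allele-pair table, converted to proportions p(ij)
tab  <- table(factor(parts[, 1], levels = alleles),
              factor(parts[, 2], levels = alleles))
p.ij <- matrix(tab / sum(tab), nrow = length(alleles),
               dimnames = list(alleles, alleles))

# Symmetrised pair frequency: 1/2 * ( p(ij) + p(ji) )
p.pair <- (p.ij + t(p.ij)) / 2

# Allele probabilities p(i): row sums of the symmetrised table
p <- rowSums(p.pair)

# D = 1/2 * ( p(ij) + p(ji) ) - p(i)*p(j), for pairs with i != j
expected <- outer(p, p)
D <- p.pair - expected
diag(D) <- NA

# r = -D / sqrt( p(i)*(1-p(i)) * p(j)*(1-p(j)) )
v <- p * (1 - p)
r <- -D / sqrt(outer(v, v))

round(D, 4)
round(r, 4)
```

     Because the pair table is symmetrised first, the resulting D and
     r matrices are symmetric, so each allele pair appears twice.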

     When there are more than two alleles, the summary values for these
     statistics are obtained by computing a weighted average of the
     absolute value of the statistic for each allele pair, where the
     weight is determined by the expected frequency. For example:


                   D.overall = sum |D(ij)| * p(ij)
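
     A minimal sketch of this summary in base R follows.  It rests on
     assumptions that the text does not confirm: the weights are taken
     to be the expected pair frequencies p(i)*p(j), the sum runs over
     all off-diagonal (i != j) entries of the symmetric table, and no
     renormalisation is applied.  The pair frequencies are entered by
     hand from the genotype counts of 'three.data' in the Examples;
     all variable names are illustrative.

```r
alleles <- c("A", "C", "T")

# Symmetrised pair frequencies 1/2 * ( p(ij) + p(ji) ) for 61 genotypes:
# A/A = 8, C/A = 20, C/T = 20, C/C = 10, T/T = 3
p.pair <- matrix(c( 8, 10,  0,
                   10, 10, 10,
                    0, 10,  3) / 61,
                 nrow = 3, byrow = TRUE,
                 dimnames = list(alleles, alleles))

p        <- rowSums(p.pair)     # allele probabilities p(i)
expected <- outer(p, p)         # expected pair frequencies p(i)*p(j)

D <- p.pair - expected          # per-pair D
diag(D) <- NA                   # only pairs with i != j contribute

# Assumed weighting: |D(ij)| weighted by the expected frequency
D.overall <- sum(abs(D) * expected, na.rm = TRUE)
D.overall
```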


     Bootstrapping is used to generate the confidence interval in order
     to avoid reliance on parametric assumptions, which will not hold
     for alleles with low frequencies (e.g. D' following a Chi-square
     distribution).
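
     The general idea can be sketched with a generic percentile
     bootstrap in base R.  This is not necessarily how 'diseq.ci'
     resamples or constructs its interval; it simply resamples
     genotypes with replacement and takes quantiles of the recomputed
     D for one allele pair.  The helper 'pair.D' and the other
     variable names are hypothetical and exist only for this sketch.

```r
set.seed(1)  # make the resampling reproducible

# Same genotypes as 'three.data' in the Examples
geno <- c(rep("A/A", 8), rep("C/A", 20), rep("C/T", 20),
          rep("C/C", 10), rep("T/T", 3))

# D for one allele pair (i, j), i != j:
#   D = 1/2 * ( p(ij) + p(ji) ) - p(i)*p(j)
pair.D <- function(g, i, j) {
  parts <- do.call(rbind, strsplit(g, "/", fixed = TRUE))
  p.i <- mean(parts == i)                 # allele frequency of i
  p.j <- mean(parts == j)                 # allele frequency of j
  het <- mean((parts[, 1] == i & parts[, 2] == j) |
              (parts[, 1] == j & parts[, 2] == i))   # p(ij) + p(ji)
  het / 2 - p.i * p.j
}

R    <- 1000    # number of bootstrap iterations
conf <- 0.95    # confidence level

D.hat  <- pair.D(geno, "A", "C")
D.boot <- replicate(R, pair.D(sample(geno, replace = TRUE), "A", "C"))

# Percentile interval: cut off (1 - conf)/2 in each tail
alpha <- (1 - conf) / 2
ci <- quantile(D.boot, c(alpha, 1 - alpha))

D.hat
ci
```

     The interval makes no distributional assumption about D; its
     width reflects only the variability of the resampled estimates.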

     See the function 'HWE.test' for testing Hardy-Weinberg
     Equilibrium, D=0.

_V_a_l_u_e:

     'diseq' returns an object of class 'diseq' with components 

    data: 2-way table of allele pair counts

   D.hat: matrix giving the observed count, expected count, observed -
          expected difference, and estimate of disequilibrium for each
          pair of alleles as well as an overall disequilibrium value.

    call: function call used to create this object

     'diseq.ci' returns an object of class 'bootci'

_A_u_t_h_o_r(_s):

     Gregory R. Warnes Gregory_R_Warnes@groton.pfizer.com 

_S_e_e _A_l_s_o:

     'genotype', 'HWE.test', 'boot', 'boot.ci'

_E_x_a_m_p_l_e_s:

     example.data   <- c("D/D","D/I","D/D","I/I","D/D",
                         "D/D","D/D","D/D","I/I","")
     g1  <- genotype(example.data)
     g1

     diseq(g1)
     diseq.ci(g1)
     HWE.test(g1)  # does the same, plus tests D-hat=0

     three.data   <- c(rep("A/A",8),
                       rep("C/A",20),
                       rep("C/T",20),
                       rep("C/C",10),
                       rep("T/T",3))

     g3  <- genotype(three.data)
     g3

     diseq(g3)
     diseq.ci(g3, R=10000, ci.type="bca")

     # only show observed vs expected table
     print(diseq(g3),show='table')

