diseq                package:genetics                R Documentation

_E_s_t_i_m_a_t_e _o_r _C_o_m_p_u_t_e _C_o_n_f_i_d_e_n_c_e _I_n_t_e_r_v_a_l _f_o_r _t_h_e _S_i_n_g_l_e-_M_a_r_k_e_r _D_i_s_e_q_u_i_l_i_b_r_i_u_m

_D_e_s_c_r_i_p_t_i_o_n:

     Estimate or compute confidence interval for single-marker
     disequilibrium.

_U_s_a_g_e:

     diseq(x, ...)
     ## S3 method for class 'diseq':
     print(x, show=c("D","D'","r"), ...)
     diseq.ci(x, R=1000, conf=0.95, correct=TRUE, na.rm=TRUE, ...)

_A_r_g_u_m_e_n_t_s:

       x: genotype or haplotype object.

    show: a character value or vector indicating which disequilibrium
          measures should be displayed.  The default is to show all of
          the available measures.

    conf: Confidence level to use when computing the confidence
          interval for D-hat.  Defaults to 0.95, should be in (0,1).

       R: Number of bootstrap iterations to use when computing the
          confidence interval. Defaults to 1000.

 correct: See details.

   na.rm: logical. Should missing values be removed?

     ...: optional parameters passed to 'boot.ci' ('diseq.ci') or
          ignored.

_D_e_t_a_i_l_s:

     For a single-gene marker, 'diseq' computes the Hardy-Weinberg
     (dis)equilibrium statistics D, D', and r (the correlation
     coefficient) for each pair of allele values, as well as an overall
     value for each. 'print.diseq' displays the contents of a 'diseq'
     object. 'diseq.ci' computes a bootstrap confidence interval for
     this estimate.

     For each allele pair,

   _D is defined as the half of the raw difference in frequency between
        the observed number of heterozygotes and the expected number:

                 D = 1/2 * ( p(ij) - 2 * p(i)*p(j) )


   _D' rescales D to span the range [-1,1] 

                            D' = D / Dmax

        where, if D > 0:

                   Dmax = min( p(i),p(j) ) - p(ij)

        or if D < 0:

                             Dmax = p(ij)


   _r is the correlation coefficient between the two alleles ignoring
        all other alleles, and can be computed by

    r = -D / sqrt( p(i)*(1-p(i)) * p(j)*(1-p(j)) ) = -D / p(i)p(j)


     where

   - p(i) is the observed probability of allele 'i',

   - p(j) is the observed probability of allele 'j', and

   - p(ij) is the observed probability of the allele pair 'ij'.
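
     The definitions of D and r can be checked by hand.  The following
     is a minimal base-R sketch, not the package's implementation: it
     takes the two-allele genotypes from the Examples section (missing
     entry dropped), treats p(ij) as the heterozygote frequency
     regardless of allele order, and uses illustrative variable names
     ('alleles', 'p_i', 'p_j', 'p_ij') that do not come from the
     package.

```r
# Genotypes from the Examples section, with the missing entry dropped.
geno <- c("D/D", "D/I", "D/D", "I/I", "D/D", "D/D", "D/D", "D/D", "I/I")

# Split each genotype into its two alleles (one row per individual).
alleles <- do.call(rbind, strsplit(geno, "/", fixed = TRUE))

# Observed allele frequencies p(i) and p(j), counted over all alleles.
p_i <- mean(alleles == "D")
p_j <- mean(alleles == "I")

# Observed frequency of the heterozygous pair (assumption: order ignored).
p_ij <- mean(alleles[, 1] != alleles[, 2])

# D: half the observed-minus-expected heterozygote frequency.
D <- 0.5 * (p_ij - 2 * p_i * p_j)

# r: correlation form, using the square-root denominator.
r <- -D / sqrt(p_i * (1 - p_i) * p_j * (1 - p_j))

print(round(c(p_i = p_i, p_j = p_j, p_ij = p_ij, D = D, r = r), 4))
```

     With only two alleles present, p(i) + p(j) = 1, so the
     denominator of r equals p(i)*p(j) in this sketch.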

     When there are more than two alleles, the summary value for each
     statistic is obtained by computing a weighted average of the
     absolute value of the statistic over all allele pairs, where the
     weight is determined by the expected frequency. For example:


                   D.overall = sum |D(ij)| * p(ij)
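
     As an illustration of the summary formula, the sketch below
     applies it literally to the three-allele genotypes from the
     Examples section using base R only.  It is an assumption-laden
     sketch rather than the package's code: p(ij) is taken as the
     observed heterozygote frequency for each pair, the weights are
     the p(ij) shown in the formula, and the names 'pair_stats' and
     'D_overall' are made up for this example.

```r
# Three-allele genotypes from the Examples section.
geno <- c(rep("A/A", 8), rep("C/A", 20), rep("C/T", 20),
          rep("C/C", 10), rep("T/T", 3))
alleles <- do.call(rbind, strsplit(geno, "/", fixed = TRUE))
allele_names <- sort(unique(as.vector(alleles)))

# Observed allele frequencies, named by allele.
p <- sapply(allele_names, function(a) mean(alleles == a))

# All unordered pairs of distinct alleles, one pair per column.
pairs <- combn(allele_names, 2)

# For each pair: observed pair frequency p(ij) and the statistic D(ij).
pair_stats <- apply(pairs, 2, function(ab) {
  i <- ab[1]
  j <- ab[2]
  p_ij <- mean((alleles[, 1] == i & alleles[, 2] == j) |
               (alleles[, 1] == j & alleles[, 2] == i))
  c(p_ij = p_ij, D = 0.5 * (p_ij - 2 * p[[i]] * p[[j]]))
})
colnames(pair_stats) <- apply(pairs, 2, paste, collapse = "/")

# Summary value: sum over pairs of |D(ij)| * p(ij).
D_overall <- sum(abs(pair_stats["D", ]) * pair_stats["p_ij", ])

print(pair_stats)
print(D_overall)
```

     A pair that is never observed together (here A/T) has p(ij) = 0
     and therefore contributes nothing to the sum under this
     weighting.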


     Bootstrapping is used to generate confidence intervals in order to
     avoid reliance on parametric assumptions, which will not hold for
     alleles with low frequencies (e.g. D' following a Chi-square
     distribution).
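
     The idea behind a bootstrap interval can be sketched in base R.
     The code below is a simplified percentile bootstrap for D on a
     two-allele marker, written only for illustration: it does not
     call 'boot' or 'boot.ci', the helper 'd_stat' is hypothetical,
     and the seed is arbitrary.  'R' and 'conf' play the same roles as
     the arguments of 'diseq.ci'.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Two-allele genotypes from the Examples section, missing entry dropped.
geno <- c("D/D", "D/I", "D/D", "I/I", "D/D", "D/D", "D/D", "D/D", "I/I")

# Hypothetical helper: D for a marker with alleles "D" and "I".
d_stat <- function(g) {
  alleles <- do.call(rbind, strsplit(g, "/", fixed = TRUE))
  p_i <- mean(alleles == "D")
  p_j <- mean(alleles == "I")
  p_ij <- mean(alleles[, 1] != alleles[, 2])  # heterozygote frequency
  0.5 * (p_ij - 2 * p_i * p_j)
}

R <- 1000     # number of bootstrap iterations
conf <- 0.95  # confidence level

# Resample individuals with replacement and recompute D each time.
boot_D <- replicate(R, d_stat(sample(geno, replace = TRUE)))

# Percentile interval: the central 'conf' share of the bootstrap values.
alpha <- (1 - conf) / 2
ci <- quantile(boot_D, c(alpha, 1 - alpha))

print(d_stat(geno))
print(ci)
```

     Resampling whole genotypes, rather than single alleles, keeps
     the pairing of alleles within individuals intact, which is what
     D measures.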

     See the function 'HWE.test' for testing Hardy-Weinberg
     Equilibrium, D=0.

_V_a_l_u_e:

     'diseq' returns an object of class 'diseq' with components 

    data: 2-way table of allele pair counts

   D.hat: matrix giving the observed count, expected count, observed -
          expected difference, and estimate of disequilibrium for each
          pair of alleles as well as an overall disequilibrium value.

    call: function call used to create this object

     'diseq.ci' returns an object of class 'bootci'

_A_u_t_h_o_r(_s):

     Gregory R. Warnes Gregory_R_Warnes@groton.pfizer.com 

_S_e_e _A_l_s_o:

     'genotype', 'HWE.test', 'boot', 'boot.ci'

_E_x_a_m_p_l_e_s:

     example.data   <- c("D/D","D/I","D/D","I/I","D/D",
                         "D/D","D/D","D/D","I/I","")
     g1  <- genotype(example.data)
     g1

     diseq(g1)
     diseq.ci(g1)
     HWE.test(g1)  # does the same, plus tests D-hat=0

     three.data   <- c(rep("A/A",8),
                       rep("C/A",20),
                       rep("C/T",20),
                       rep("C/C",10),
                       rep("T/T",3))

     g3  <- genotype(three.data)
     g3

     diseq(g3)
     diseq.ci(g3, R=10000)  # more bootstrap iterations than the default

