chisq                package:corpora                R Documentation

_P_e_a_r_s_o_n'_s _c_h_i-_s_q_u_a_r_e_d _s_t_a_t_i_s_t_i_c _f_o_r _f_r_e_q_u_e_n_c_y _c_o_m_p_a_r_i_s_o_n_s (_c_o_r_p_o_r_a)

_D_e_s_c_r_i_p_t_i_o_n:

     This function computes Pearson's chi-squared statistic (often
     written as X^2) for frequency comparison data, with or without
     Yates' continuity correction.  The implementation is based on the
     formula given by Evert (2004, 82).

_U_s_a_g_e:

     chisq(k1, n1, k2, n2, correct = TRUE, one.sided = FALSE)

_A_r_g_u_m_e_n_t_s:

      k1: frequency of a type in the first corpus (or an integer vector
          of type frequencies)

      n1: the sample size of the first corpus (or an integer vector
          specifying the sizes of different samples)

      k2: frequency of the type in the second corpus (or an integer
          vector of type frequencies, in parallel to 'k1')

      n2: the sample size of the second corpus (or an integer vector
          specifying the sizes of different samples, in parallel to
          'n1')

 correct: if 'TRUE', apply Yates' continuity correction (default)

one.sided: if 'TRUE', compute the _signed square root_ of X^2 as a
          statistic for a one-sided test (see details below; the
          default value is 'FALSE')

_D_e_t_a_i_l_s:

     The X^2 values returned by this function are identical to those
     computed by 'chisq.test'.  Unlike the latter, 'chisq' accepts
     vector arguments so that a large number of frequency comparisons
     can be carried out with a single function call.
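
     The vectorised computation described above can be sketched in
     base R.  The helper 'chisq_sketch' below is a hypothetical
     illustration written for this page, not the package's own code;
     it assumes the usual 2x2 formula for X^2 with Yates' correction
     capped at zero, and checks the result against 'chisq.test' on
     made-up counts.

```r
# Hypothetical helper (NOT the corpora implementation): vectorised
# Pearson X^2 for 2x2 frequency comparisons, optionally Yates-corrected.
chisq_sketch <- function(k1, n1, k2, n2, correct = TRUE) {
  # Convert to double to avoid integer overflow in the products below
  k1 <- as.numeric(k1); n1 <- as.numeric(n1)
  k2 <- as.numeric(k2); n2 <- as.numeric(n2)
  # Cells of the contingency table: columns = corpora,
  # rows = occurrences of the type / all other tokens
  O11 <- k1;      O12 <- k2
  O21 <- n1 - k1; O22 <- n2 - k2
  N  <- n1 + n2
  R1 <- O11 + O12   # row sums (column sums are n1 and n2)
  R2 <- O21 + O22
  term <- abs(O11 * O22 - O12 * O21)
  # Yates' correction; never allowed to push the term below zero
  if (correct) term <- pmax(term - N / 2, 0)
  N * term^2 / (R1 * R2 * n1 * n2)
}

# Made-up example data: three frequency comparisons in one call
k1 <- c(10, 25, 100); n1 <- c(1000, 1000, 5000)
k2 <- c(30, 20, 80);  n2 <- c(2000, 1500, 5000)
x2 <- chisq_sketch(k1, n1, k2, n2)

# Reference values: one chisq.test call per comparison
ref <- mapply(function(a1, s1, a2, s2) {
  tab <- rbind(c(a1, a2), c(s1 - a1, s2 - a2))
  unname(chisq.test(tab, correct = TRUE)$statistic)
}, k1, n1, k2, n2)

print(all.equal(x2, ref))
```

     The loop over 'chisq.test' needs one call per comparison, whereas
     the vectorised form handles all comparisons with plain vector
     arithmetic.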

     The one-sided test statistic (for 'one.sided=TRUE') is the signed
     square root of X^2.  It is positive for k_1/n_1 > k_2/n_2 and
     negative for k_1/n_1 < k_2/n_2.  Note that under the null
     hypothesis of equal proportions this statistic has an
     (asymptotic) _standard normal distribution_ rather than a
     chi-squared distribution.
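
     As a rough illustration of the signed square root, the following
     base R sketch computes it for made-up counts and converts it to a
     one-sided p-value with 'pnorm'.  The helper 'signed_root_sketch'
     is hypothetical and is not taken from the package; the sign is
     assumed to follow k_1/n_1 - k_2/n_2 as stated above.

```r
# Hypothetical helper (NOT the corpora implementation): signed square
# root of Pearson's X^2 for a 2x2 frequency comparison.
signed_root_sketch <- function(k1, n1, k2, n2, correct = TRUE) {
  k1 <- as.numeric(k1); n1 <- as.numeric(n1)
  k2 <- as.numeric(k2); n2 <- as.numeric(n2)
  N <- n1 + n2
  # k1*n2 - k2*n1 has the same sign as k1/n1 - k2/n2
  det <- k1 * n2 - k2 * n1
  term <- abs(det)
  if (correct) term <- pmax(term - N / 2, 0)   # Yates' correction
  x2 <- N * term^2 / ((k1 + k2) * (N - k1 - k2) * n1 * n2)
  sign(det) * sqrt(x2)
}

# Made-up counts: the same comparison in both directions
z <- signed_root_sketch(c(30, 10), c(1000, 1000),
                        c(10, 30), c(1000, 1000), correct = FALSE)
print(z)   # first value positive, second value negative

# One-sided p-value for the alternative k1/n1 > k2/n2, using the
# standard normal approximation
p_greater <- pnorm(z, lower.tail = FALSE)
print(p_greater)
```

     Swapping the two corpora flips the sign of the statistic, so the
     same comparison yields a small p-value in one direction and a
     p-value close to 1 in the other.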

_V_a_l_u_e:

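     The chi-squared statistic X^2 corresponding to the specified data
     (or a vector of X^2 values).  Under the null hypothesis of equal
     proportions, this statistic has an (asymptotic) _chi-squared
     distribution_ with df=1.  If 'one.sided=TRUE', the signed square
     root of X^2 is returned instead (see details above).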
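
     Because X^2 is referred to a chi-squared distribution with df=1,
     a two-sided p-value can be obtained with 'pchisq'.  The sketch
     below uses only base R and made-up counts; it takes the statistic
     from 'chisq.test' so that the result can be checked against the
     p-value reported by that function.

```r
# Made-up counts: 10 of 1000 tokens vs. 30 of 2000 tokens
tab <- rbind(c(10, 30),      # occurrences of the type
             c(990, 1970))   # all other tokens
res <- chisq.test(tab, correct = TRUE)

# Upper tail of the chi-squared distribution with one degree of freedom
x2 <- unname(res$statistic)
p  <- pchisq(x2, df = 1, lower.tail = FALSE)

print(all.equal(p, res$p.value))
```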

_A_u_t_h_o_r(_s):

     Stefan Evert

_R_e_f_e_r_e_n_c_e_s:

     Evert, Stefan (2004). _The Statistics of Word Cooccurrences: Word
     Pairs and Collocations._  Ph.D. thesis, Institut für maschinelle
     Sprachverarbeitung, University of Stuttgart.  Published in 2005,
     URN urn:nbn:de:bsz:93-opus-23714. Available from <URL:
     http://www.collocations.de/phd.html>.

_S_e_e _A_l_s_o:

     'chisq.pval', 'chisq.test', 'cont.table'

