svytable               package:survey               R Documentation

_C_o_n_t_i_n_g_e_n_c_y _t_a_b_l_e_s _f_o_r _s_u_r_v_e_y _d_a_t_a

_D_e_s_c_r_i_p_t_i_o_n:

     Contingency tables and chi-squared tests of association for survey
     data.

_U_s_a_g_e:

     ## S3 method for class 'survey.design':
     svytable(formula, design, Ntotal = NULL, round = FALSE, ...)
     ## S3 method for class 'svyrep.design':
     svytable(formula, design, Ntotal = sum(weights(design, "sampling")),
              round = FALSE, ...)
     ## S3 method for class 'survey.design':
     svychisq(formula, design,
              statistic = c("F", "Chisq", "Wald", "adjWald", "lincom",
                            "saddlepoint"),
              na.rm = TRUE, ...)
     ## S3 method for class 'svyrep.design':
     svychisq(formula, design,
              statistic = c("F", "Chisq", "Wald", "adjWald", "lincom",
                            "saddlepoint"),
              na.rm = TRUE, ...)
     ## S3 method for class 'svytable':
     summary(object,
             statistic = c("F", "Chisq", "Wald", "adjWald", "lincom",
                           "saddlepoint"),
             ...)
     degf(design, ...)
     ## S3 method for class 'survey.design2':
     degf(design, ...)
     ## S3 method for class 'svyrep.design':
     degf(design, tol = 1e-5, ...)

_A_r_g_u_m_e_n_t_s:

 formula: Model formula specifying margins for the table (using '+'
          only)

  design: Survey design object

statistic: Test statistic to use; see Details below

  Ntotal: A population total or set of population stratum totals to
          normalise to.

   round: Should the table entries be rounded to the nearest integer?

   na.rm: Should missing values be removed?

  object: Output from 'svytable'

     ...: Other arguments for future expansion

     tol: Tolerance for 'qr' in computing the matrix rank

_D_e_t_a_i_l_s:

     The 'svytable' function computes a weighted crosstabulation.  In
     many cases it is easier to use 'svytotal' or 'svymean', which also
     produce standard errors, design effects, etc.
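
     As a rough illustration of what a weighted crosstabulation
     means, the sketch below compares 'svytable' with a base R 'xtabs'
     call that sums the sampling weights in each cell. It assumes the
     'api' data shipped with the survey package and a design object
     named 'dclus1' built from it; it is not the package's internal
     code.

```r
library(survey)
data(api)

# One-stage cluster sample of school districts; 'pw' holds the sampling weights.
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Design-based table: each cell is the sum of the sampling weights in that cell.
tbl <- svytable(~sch.wide + stype, dclus1)

# The same cell totals from base R, with the weights on the left-hand side.
manual <- xtabs(pw ~ sch.wide + stype, data = apiclus1)

# Compare the cell values, ignoring class and call attributes.
all.equal(as.vector(tbl), as.vector(manual))
```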

     The frequencies in the table can be normalised to some convenient
     total such as 100 or 1.0 by specifying the 'Ntotal' argument.  If
     the formula has a left-hand side the mean or sum of this variable
     rather than the frequency is tabulated.
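
     A minimal sketch of normalising with a single-number 'Ntotal',
     again assuming the 'api' example data and a design object named
     'dclus1':

```r
library(survey)
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Default: cells estimate population counts.
counts <- svytable(~sch.wide + stype, dclus1)

# Ntotal = 100 rescales the whole table so the cells sum to 100 (percentages).
pct <- svytable(~sch.wide + stype, dclus1, Ntotal = 100)

# Ntotal = 1 gives estimated cell proportions.
prop <- svytable(~sch.wide + stype, dclus1, Ntotal = 1)

sum(pct)
sum(prop)
```

     Only the scale changes: each rescaled cell is the corresponding
     count divided by the table total and multiplied by 'Ntotal'.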

     The 'Ntotal' argument can be either a single number or a data
     frame whose first column gives the (first-stage) sampling strata
     and second column the population size in each stratum.  In this
     second case the 'svytable' command performs 'post-stratification':
     tabulating and scaling to the population within strata and then
     adding up the strata.

     As with other 'xtabs' objects, the output of 'svytable' can be
     processed by 'ftable' for more attractive display. The 'summary'
     method for 'svytable' objects calls 'svychisq' for a test of
     independence.
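
     'ftable' is most useful for tables with more than two margins.
     The sketch below is illustrative only; it assumes the 'api'
     example data, and the choice of 'comp.imp' as a third margin is
     arbitrary.

```r
library(survey)
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Three-way weighted table; printed directly it appears as stacked 2-d slices.
tbl3 <- svytable(~sch.wide + comp.imp + stype, dclus1)

# ftable() flattens it into a single grid, here with 'stype' across the columns.
flat <- ftable(tbl3, col.vars = "stype")
flat
```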

     'svychisq' computes first- and second-order Rao-Scott corrections
     to the Pearson chi-squared test, and two Wald-type tests.

     The default ('statistic="F"') is the Rao-Scott second-order
     correction.  The p-values are computed with a Satterthwaite
     approximation to the distribution.  The alternative
     'statistic="Chisq"' adjusts the Pearson chi-squared statistic by a
     design effect estimate and then compares it to the chi-squared
     distribution it would have under simple random sampling.

     The 'statistic="Wald"' test is that proposed by Koch et al. (1975)
     and used by the SUDAAN software package. It is a Wald test based
     on the differences between the observed cell counts and those
     expected under independence. The adjustment given by
     'statistic="adjWald"' reduces the statistic when the number of
     PSUs is small compared to the number of degrees of freedom of the
     test. Thomas and Rao (1990) compare these tests and find the
     adjustment beneficial.
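
     To see how the choice of statistic affects the result, one can
     run the same test with each of the four statistics described
     above and collect the p-values. This is a sketch on the 'api'
     example data, not a recommendation of any particular statistic;
     the object names are arbitrary.

```r
library(survey)
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Run the same test of independence once per statistic.
stats <- c("F", "Chisq", "Wald", "adjWald")
tests <- lapply(stats, function(s) {
  svychisq(~sch.wide + stype, dclus1, statistic = s)
})
names(tests) <- stats

# Each result is an 'htest' object, so the p-value sits in the usual component.
pvals <- sapply(tests, function(t) as.numeric(t$p.value))
round(pvals, 4)
```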

     'statistic="lincom"' uses the exact asymptotic distribution, which
     is a linear combination of chi-squared variables (see
     'pchisqsum'), and 'statistic="saddlepoint"' uses a saddlepoint
     approximation to this distribution.

     For designs using replicate weights the code is essentially the
     same as for designs with sampling structure, since the necessary
     variance computations are done by the appropriate methods of
     'svytotal' and 'svymean'.  The exception is that the degrees of
     freedom is computed as one less than the rank of the matrix of
     replicate weights (by 'degf').
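
     The rank calculation can be sketched by hand as follows. This
     assumes the 'api' example data, and it assumes that the matrix of
     analysis weights returned by 'weights(design, "analysis")' is a
     suitable stand-in for the matrix 'degf' works with; the package's
     own implementation may differ in detail.

```r
library(survey)
data(api)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)
rclus1 <- as.svrepdesign(dclus1)

# Replicate weights as a matrix: one row per observation, one column per replicate.
repw <- weights(rclus1, type = "analysis")

# Matrix rank from a QR decomposition (tolerance as in the 'tol' default), minus one.
df_manual <- qr(repw, tol = 1e-5)$rank - 1

df_manual
degf(rclus1)
```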

     At the moment, 'svychisq' works only for 2-dimensional tables.

_V_a_l_u_e:

     The table commands return an 'xtabs' object; 'svychisq' returns an
     'htest' object.

_N_o_t_e:

     Rao and Scott (1984) leave open one computational issue. In
     computing 'generalised design effects' for these tests, should the
     variance under simple random sampling be estimated using the
     observed proportions or the predicted proportions under the null
     hypothesis? 'svychisq' uses the observed proportions, following
     simulations by Sribney (1998) and the choices made in Stata.

_R_e_f_e_r_e_n_c_e_s:

     Davies, RB (1973) "Numerical inversion of a characteristic
     function" Biometrika 60:415-7.

     Koch, GG, Freeman, DH, Freeman, JL (1975) "Strategies in the
     multivariate analysis of data from complex surveys" International
     Statistical Review 43:59-78.

     Rao, JNK, Scott, AJ (1984) "On Chi-squared Tests For Multiway
     Contingency Tables with Proportions Estimated From Survey Data"
     Annals of Statistics 12:46-60.

     Sribney, WM (1998) "Two-way contingency tables for survey or
     clustered data" Stata Technical Bulletin 45:33-49.

     Thomas, DR, Rao, JNK (1990) "Small-sample comparison of level and
     power for simple goodness-of-fit statistics under cluster
     sampling" JASA 82:630-636.

_S_e_e _A_l_s_o:

     'svytotal' and 'svymean' report totals and proportions by category
     for factor variables.

     See 'svyby' and 'ftable.svystat' to construct more complex tables
     of summary statistics.

     See 'svyloglin' for loglinear models.

_E_x_a_m_p_l_e_s:

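       data(api)
       ## Unweighted table from the full population, for comparison
       xtabs(~sch.wide + stype, data = apipop)

       ## Cluster sample of school districts
       dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)
       summary(dclus1)

       ## Weighted table and tests of independence
       (tbl <- svytable(~sch.wide + stype, dclus1))
       svychisq(~sch.wide + stype, dclus1)
       summary(tbl, statistic = "Chisq")
       svychisq(~sch.wide + stype, dclus1, statistic = "adjWald")

       ## The same analysis with replicate weights
       rclus1 <- as.svrepdesign(dclus1)
       summary(svytable(~sch.wide + stype, rclus1))
       svychisq(~sch.wide + stype, rclus1, statistic = "adjWald")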

