svydesign               package:survey               R Documentation

_S_u_r_v_e_y _s_a_m_p_l_e _a_n_a_l_y_s_i_s.

_D_e_s_c_r_i_p_t_i_o_n:

     Specify a complex survey design.

_U_s_a_g_e:

     svydesign(ids, probs=NULL, strata = NULL, variables = NULL, fpc=NULL,
     data = NULL, nest = FALSE, check.strata = !nest, weights=NULL) 

_A_r_g_u_m_e_n_t_s:

     ids: Formula or data frame specifying cluster ids from largest
          level to smallest level. Use '~0' or '~1' as the formula
          for a design with no clusters.

   probs: Formula or data frame specifying cluster sampling
          probabilities

  strata: Formula or vector specifying strata; use 'NULL' for no strata

variables: Formula or data frame specifying the variables measured in
          the survey. If 'NULL', the 'data' argument is used.

     fpc: Finite population correction: see Details below

 weights: Formula or vector specifying sampling weights as an
          alternative to 'probs'

    data: Data frame to look up variables in the formula arguments

    nest: If 'TRUE', relabel cluster ids to enforce nesting, e.g. if
          ids at the second level of sampling are reused within
          first-level units

check.strata: If 'TRUE', check that clusters are nested in strata
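
     The relabelling described for 'nest' can be pictured with a small
     base R sketch. The vectors 'psu' and 'ssu' are hypothetical, and
     'interaction' is used here only to illustrate the idea; it is not
     claimed to be what 'svydesign' does internally.

```r
# Hypothetical ids: second-level ids 1 and 2 are reused inside each PSU
psu <- c(1, 1, 2, 2)   # first-level (PSU) ids
ssu <- c(1, 2, 1, 2)   # second-level ids, not unique across PSUs

# Taken at face value there appear to be only two second-level units
n_raw <- length(unique(ssu))

# Combining the two levels gives every second-level unit its own label
ssu_nested <- interaction(psu, ssu, drop = TRUE)
n_nested <- nlevels(ssu_nested)
```

     Here 'n_raw' counts two distinct labels while 'n_nested' counts
     four distinct units, which is the situation 'nest=TRUE' is meant
     to handle.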

_D_e_t_a_i_l_s:

     When analysing data from a complex survey, observations must be
     weighted inversely to their sampling probabilities, and the
     effects of stratification and of correlation induced by cluster
     sampling must be incorporated in standard errors.
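
     As a minimal sketch of inverse-probability weighting, the base R
     code below uses made-up values 'y' and 'prob' (they are not taken
     from any survey data set) to form weighted estimates of a
     population total and mean. It covers point estimates only and
     says nothing about standard errors.

```r
# Hypothetical sample: three units with unequal sampling probabilities
y    <- c(10, 20, 30)
prob <- c(0.5, 0.25, 0.25)

# Sampling weights are the reciprocals of the sampling probabilities
w <- 1 / prob

# Weighted estimates of the population total and the population mean
total_hat <- sum(w * y)
mean_hat  <- sum(w * y) / sum(w)
```

     Each unit stands in for '1/prob' units of the population, so the
     units sampled with probability 0.25 count twice as much as the
     unit sampled with probability 0.5.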

     The 'svydesign' object combines a data frame and all the survey
     design information needed to analyse it.  These objects are used
     by the survey modelling and summary functions.

     The finite population correction is used to reduce the variance
     when a substantial fraction of the total population of interest
     has been sampled. It may not be appropriate if the target of
     inference is the process generating the data rather than the
     statistics of a particular finite population.

     The finite population correction can be specified either as the
     total population size in each stratum or as the fraction of the
     total population that has been sampled. In either case the
     relevant population size is counted in primary sampling units,
     the largest clusters. For example, sampling 100 units from a
     population stratum of size 500 can be specified as 500 or as
     100/500=0.2.  The finite
     population correction can be specified by a vector with one
     element for each individual (in which case it is an error for it
     to vary within a stratum) or as a data frame with one row per
     stratum.  The first column of the data frame should be a factor
     with the same levels as 'strata' and the second column the finite
     population correction.
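
     The following base R sketch illustrates the two ways of writing
     the correction and the two container layouts described above. All
     object names and numbers are hypothetical, and the variance
     formula is the textbook one for a mean under simple random
     sampling without replacement, shown only to make the size of the
     correction concrete.

```r
# One stratum: n = 100 primary sampling units drawn from N = 500
n <- 100
N <- 500
fpc_as_size     <- N       # population size form
fpc_as_fraction <- n / N   # sampling fraction form

# Effect on the variance of a sample mean, for a made-up sample variance
s2         <- 25
var_no_fpc <- s2 / n
var_fpc    <- (1 - n / N) * s2 / n

# Layout 1: one element per observation, constant within each stratum
strata  <- factor(c("A", "A", "B", "B", "B"))
fpc_vec <- c(500, 500, 300, 300, 300)
constant_within <- all(tapply(fpc_vec, strata,
                              function(v) length(unique(v)) == 1))

# Layout 2: one row per stratum, stratum factor first, correction second
fpc_df <- data.frame(strata = factor(c("A", "B")), fpc = c(500, 300))
```

     With a sampling fraction of 0.2 the corrected variance is 80% of
     the uncorrected one.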

     If population sizes are specified but not sampling probabilities
     or weights, the sampling probabilities will be computed from the
     population sizes assuming simple random sampling within strata.
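
     Under simple random sampling within strata, the sampling
     probability in a stratum is the number of sampled units divided
     by the stratum population size. The sketch below works this out
     in base R for hypothetical strata "A" and "B"; it illustrates the
     arithmetic and is not the package's own code.

```r
# Hypothetical stratum label for each sampled unit
strata <- c("A", "A", "B", "B", "B")

# Hypothetical population size of each stratum
popsize <- c(A = 10, B = 30)

# Number of sampled units in each stratum
n_h <- table(strata)

# Sampling probability for each unit: n_h / N_h for its stratum
prob <- unname(as.numeric(n_h[strata]) / popsize[strata])

# Implied sampling weights; they add up to the total population size
w <- 1 / prob
```

     Here the weights sum to 40, the combined size of the two
     hypothetical strata.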

     The 'dim', '"["', '"[<-"' and 'na.action' methods for
     'survey.design' objects operate on the data frame specified by
     'variables' and ensure that the design information is properly
     updated to correspond to the new data frame.  With the '"[<-"'
     method the new value can be a 'survey.design' object instead of a
     data frame, but only the data frame is used. See also
     'subset.survey.design' for a simple way to select subpopulations.
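
     A short usage sketch of these methods, assuming the 'survey'
     package is installed and using the 'apistrat' data from the
     Examples section. The object names 'first10' and 'elem' are
     illustrative only.

```r
library(survey)
data(api)

# Stratified design, as in the Examples section
dstrat <- svydesign(ids = ~1, strata = ~stype, weights = ~pw,
                    data = apistrat, fpc = ~fpc)

# dim() reports the dimensions of the underlying data frame
dim(dstrat)

# "[" selects rows and keeps the design information aligned with them
first10 <- dstrat[1:10, ]
dim(first10)

# subset() selects a subpopulation by a logical condition
elem <- subset(dstrat, stype == "E")
```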

     The value of 'options("survey.lonely.psu")' controls what happens
     to strata containing only one cluster (PSU). See 'svyCprod' for
     details, especially if you have self-representing ("certainty")
     PSUs.
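
     The option is set like any other R option. In the sketch below
     the value "adjust" is used as an example setting; which values
     are accepted, and what each one does, should be checked in the
     'svyCprod' help for the installed version.

```r
# Set the option, remembering the previous value
old <- options(survey.lonely.psu = "adjust")

# Inspect the current setting
current <- getOption("survey.lonely.psu")

# Restore the previous value when done
options(old)
```

     Saving and restoring the old value keeps the change local to one
     analysis rather than the whole session.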

_V_a_l_u_e:

     An object of class 'survey.design'.

_A_u_t_h_o_r(_s):

     Thomas Lumley

_S_e_e _A_l_s_o:

     'svyglm', 'svymean', 'svyvar', 'svytable', 'svyquantile',
     'subset.survey.design', 'update.survey.design'

_E_x_a_m_p_l_e_s:

     data(api)
     # stratified sample
     dstrat<-svydesign(id=~1,strata=~stype, weights=~pw, data=apistrat, fpc=~fpc)
     # one-stage cluster sample
     dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
     # two-stage cluster sample
     dclus2<-svydesign(id=~dnum+snum, weights=~pw, data=apiclus2)

     ## syntax for stratified cluster sample
     ##(though the data weren't really sampled this way)
     svydesign(id=~dnum, strata=~stype, weights=~pw, data=apistrat, nest=TRUE)

