svyCprod               package:survey               R Documentation

_C_o_m_p_u_t_a_t_i_o_n_s _f_o_r _s_u_r_v_e_y _v_a_r_i_a_n_c_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     Computes the sum of products needed for the variance of survey
     sample estimators.  'svyCprod' is used for survey design objects
     created before version 2.9; 'onestage' is called by 'svyrecvar'
     for post-2.9 design objects.

_U_s_a_g_e:

     svyCprod(x, strata, psu, fpc, nPSU, certainty=NULL, postStrata=NULL,
           lonely.psu=getOption("survey.lonely.psu"))
     onestage(x, strata, clusters, nPSU, fpc,
           lonely.psu=getOption("survey.lonely.psu"), stage=0, cal)

_A_r_g_u_m_e_n_t_s:

       x: A vector or matrix

  strata: A vector of stratum indicators (may be 'NULL' for 'svyCprod')

     psu: A vector of cluster indicators (may be 'NULL')

clusters: A vector of cluster indicators 

     fpc: A data frame ('svyCprod') or vector ('onestage') of
          population stratum sizes, or 'NULL'

    nPSU: Table ('svyCprod') or vector ('onestage') of original sample
          stratum sizes (or 'NULL')

certainty: Logical vector with stratum names as names. If an element
          is 'TRUE' and that stratum has a single PSU, the PSU is
          treated as a certainty PSU

postStrata: Post-stratification variables

lonely.psu: One of '"remove"', '"adjust"', '"fail"', '"certainty"',
          '"average"'. See Details below.

   stage: Used internally to track the depth of recursion

     cal: Used to pass calibration information at stages below the
          population

_D_e_t_a_i_l_s:

     The observations for each cluster are added, then centered within
     each stratum and the outer product is taken of the row vector
     resulting for each cluster.  This is added within strata,
     multiplied by a degrees-of-freedom correction and by a finite
     population correction (if supplied) and added across strata.  
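
     The steps in the preceding paragraph can be sketched in base R as
     follows.  This is an illustration only, not the package's internal
     code: the function name 'sketch_cprod' and its 'popsize' argument
     are hypothetical, 'x' is assumed to hold already-weighted values,
     the degrees-of-freedom correction is assumed to be n/(n-1), and
     the finite population correction is assumed to be 1 - n/N, where n
     and N are the sampled and population PSU counts in a stratum.

```r
# Illustrative sketch only; not the survey package's internal code.
# x:       numeric matrix of (already weighted) values, one row per observation
# strata:  stratum identifier for each row
# psu:     cluster identifier for each row
# popsize: hypothetical named vector of population PSU counts per stratum
sketch_cprod <- function(x, strata, psu, popsize = NULL) {
  x <- as.matrix(x)
  v <- matrix(0, ncol(x), ncol(x))
  for (h in unique(strata)) {
    in_h <- strata == h
    # Add the observations within each cluster
    totals <- rowsum(x[in_h, , drop = FALSE], psu[in_h])
    n_h <- nrow(totals)
    # Centre the cluster totals at the stratum mean
    centred <- sweep(totals, 2, colMeans(totals))
    # Sum of outer products of the centred rows
    ss <- crossprod(centred)
    # Degrees-of-freedom correction (assumed here to be n/(n-1))
    scale <- n_h / (n_h - 1)
    # Finite population correction (assumed here to be 1 - n/N), if supplied
    if (!is.null(popsize)) {
      scale <- scale * (1 - n_h / popsize[[as.character(h)]])
    }
    # Add across strata
    v <- v + scale * ss
  }
  v
}

# Two strata with two clusters each; stratum "a" has two observations per cluster
x      <- matrix(c(1, 2, 3, 4, 10, 14), ncol = 1)
strata <- c("a", "a", "a", "a", "b", "b")
psu    <- c(1, 1, 2, 2, 3, 4)

v     <- sketch_cprod(x, strata, psu)                              # 32
v_fpc <- sketch_cprod(x, strata, psu, popsize = c(a = 4, b = 10))  # 20.8
```

     In this sketch a stratum with a single cluster gives 0 * Inf,
     that is 'NaN', which corresponds to the 0/0 case discussed later
     in this section.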

     If there are fewer clusters (PSUs) in a stratum than in the
     original design, extra rows of zeroes are added to 'x' to allow
     the correct subpopulation variance to be computed.
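
     As a rough illustration of the zero-padding, the following base R
     sketch handles one stratum in which only two of four sampled PSUs
     appear in the subpopulation.  The numbers and the variable names
     are made up for the example, and the n/(n-1) scaling is an
     assumption about the degrees-of-freedom correction.

```r
# Cluster totals for the PSUs of one stratum that appear in the domain
totals   <- matrix(c(3, 7), ncol = 1)
n_design <- 4  # hypothetical: PSUs sampled in this stratum in the full design

# Pad with rows of zeroes so every originally sampled PSU is represented
padded <- rbind(totals, matrix(0, n_design - nrow(totals), ncol(totals)))

# Centre at the mean over all n_design rows, then scale the sum of products
centred  <- sweep(padded, 2, colMeans(padded))
v_padded <- n_design / (n_design - 1) * crossprod(centred)   # 44

# For comparison: ignoring the missing PSUs gives a different value
centred2   <- sweep(totals, 2, colMeans(totals))
v_unpadded <- nrow(totals) / (nrow(totals) - 1) * crossprod(centred2)  # 16
```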

     See 'postStratify' for information about post-stratification
     adjustments.

     The variance formula gives 0/0 if a stratum contains only one
     sampling unit. If the 'certainty' argument specifies that this is
     a PSU sampled with probability 1 (a "certainty" PSU) then it does
     not contribute to the variance (this is correct only when there is
     no subsampling within the PSU; otherwise it should be defined as
     a pseudo-stratum).  If 'certainty' is 'FALSE' for this stratum or
     is not supplied, the result depends on 'lonely.psu'.

     The options are '"fail"' to give an error, '"remove"' or
     '"certainty"' to give a variance contribution of 0 for the
     stratum, '"adjust"' to center the stratum at the grand mean rather
     than the stratum mean, and '"average"' to assign strata with one
     PSU the average variance contribution from strata with more than
     one PSU.  The choice is controlled by setting the
     'survey.lonely.psu' option with 'options()'.  If this is not
     done, the factory default is '"fail"'.  Using '"adjust"' is
     conservative, and it would often be better to combine strata in
     some intelligent way.
     The properties of '"average"' have not been investigated
     thoroughly, but it may be useful when the lonely PSUs are due to a
     few strata having PSUs missing completely at random.
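
     The way most of these options treat a single-PSU stratum can be
     sketched for scalar variance contributions as below.  The function
     'lonely_contribution' and its arguments are hypothetical and are
     not part of the package; '"adjust"' is left out because it needs
     the data and the grand mean rather than the other strata's
     contributions.

```r
# Illustrative sketch only; not the survey package's internal code.
# lonely.psu: one of the option values described above
# others:     variance contributions from strata with more than one PSU
lonely_contribution <- function(lonely.psu, others) {
  switch(lonely.psu,
    fail      = stop("stratum has only one PSU"),
    remove    = ,            # falls through: same result as "certainty"
    certainty = 0,
    average   = mean(others),
    stop("option not covered by this sketch")
  )
}

others <- c(16, 8)
removed  <- lonely_contribution("remove", others)    # 0
averaged <- lonely_contribution("average", others)   # 12
```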

     The '"remove"' and '"certainty"' options give the same result, but
     '"certainty"' is intended for situations where there is only one
     PSU in the population stratum, which is sampled with certainty
     (also called "self-representing" PSUs or strata). With
     '"certainty"' no warning is generated for strata with only one
     PSU.  Ordinarily, 'svydesign' will detect certainty PSUs, making
     this option unnecessary.

     For strata with a single PSU in a subset (domain) the variance
     formula gives a value that is well-defined and positive, but not
     typically correct. If 'options("survey.adjust.domain.lonely")' is
     'TRUE' and 'options("survey.lonely.psu")' is '"adjust"' or
     '"average"', and no post-stratification or G-calibration has been
     done, strata with a single PSU in a subset will be treated like
     those with a single PSU in the sample.  I am not aware of any
     theoretical study of this procedure, but it should at least be
     conservative.
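
     A minimal sketch of setting the two options named above, using
     only base R's 'options()' and 'getOption()'.  Saving and restoring
     the previous values is a suggestion, not a requirement.

```r
# Set both options, keeping the previous values so they can be restored
old <- options(survey.lonely.psu = "adjust",
               survey.adjust.domain.lonely = TRUE)

lonely        <- getOption("survey.lonely.psu")
domain_lonely <- getOption("survey.adjust.domain.lonely")

# ... survey analyses run at this point would see the settings above ...

# Restore whatever was set before
options(old)
```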

_V_a_l_u_e:

     A covariance matrix

_A_u_t_h_o_r(_s):

     Thomas Lumley

_R_e_f_e_r_e_n_c_e_s:

     Binder, David A. (1983).  On the variances of asymptotically
     normal estimators from complex surveys.  International Statistical
     Review, 51, 279-292.

_S_e_e _A_l_s_o:

     'svydesign', 'svyrecvar', 'surveyoptions', 'postStratify'

