svrepdesign              package:survey              R Documentation

_S_p_e_c_i_f_y _s_u_r_v_e_y _d_e_s_i_g_n _w_i_t_h _r_e_p_l_i_c_a_t_e _w_e_i_g_h_t_s

_D_e_s_c_r_i_p_t_i_o_n:

     Some recent large-scale surveys specify replication weights rather
     than the sampling design (partly for privacy reasons).  This
     function specifies the data structure for such a survey.

_U_s_a_g_e:

     svrepdesign(variables, repweights, weights, data, ...)
     ## Default S3 method:
     svrepdesign(variables = NULL, repweights = NULL, weights = NULL,
                 data = NULL,
                 type = c("BRR", "Fay", "JK1", "JKn", "bootstrap", "other"),
                 combined.weights = TRUE, rho = NULL,
                 bootstrap.average = NULL, scale = NULL, rscales = NULL,
                 fpc = NULL, fpctype = c("fraction", "correction"), ...)
     ## S3 method for class 'imputationList':
     svrepdesign(variables = NULL, repweights, weights, data, ...)
     ## S3 method for class 'svyrep.design':
     image(x, ..., col = grey(seq(.5, 1, length = 30)),
           type. = c("rep", "total"))
     ## S3 method for class 'character':
     svrepdesign(variables = NULL, repweights = NULL, weights = NULL,
                 data = NULL,
                 type = c("BRR", "Fay", "JK1", "JKn", "bootstrap", "other"),
                 combined.weights = TRUE, rho = NULL,
                 bootstrap.average = NULL, scale = NULL, rscales = NULL,
                 fpc = NULL, fpctype = c("fraction", "correction"),
                 dbtype = "SQLite", dbname, ...)

_A_r_g_u_m_e_n_t_s:

variables: formula or data frame specifying variables to include in the
          design (default is all) 

repweights: formula or data frame specifying replication weights, or
          character string specifying a regular expression that matches
          the names of the replication weight variables 

 weights: sampling weights 

    data: data frame to look up variables in formulas, or character
          string giving name of database table

    type: Type of replication weights

combined.weights: 'TRUE' if the 'repweights' already include the
          sampling weights

     rho: Shrinkage factor for weights in Fay's method

bootstrap.average: For 'type="bootstrap"', if the bootstrap weights
          have been averaged, gives the number of iterations averaged
          over

scale, rscales: Scaling constants for the variance; see Details below

fpc,fpctype: Finite population correction information

  dbname: name of database, passed to 'DBI::dbConnect()'

  dbtype: Database driver: see Details

       x: survey design with replicate weights

     ...: Other arguments to 'image'

     col: Colors

   type.: '"rep"' for only the replicate weights, '"total"' for the
          replicate and sampling weights combined.

_D_e_t_a_i_l_s:

     In the BRR method, the dataset is split into halves, and the
     difference between halves is used to estimate the variance. In
     Fay's method, rather than removing observations from half the
     sample they are given weight 'rho' in one half-sample and '2-rho'
     in the other.  The ideal BRR analysis is restricted to a design
     where each stratum has two PSUs; however, it has been used in a
     much wider class of surveys.
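
     As an illustration of the two weighting schemes, the sketch below
     builds BRR and Fay replicate factors from a matrix of half-sample
     indicators in base R.  The indicator matrix is the one used in the
     Examples section; the value of 'rho' and the names 'half',
     'brr_wts' and 'fay_wts' are assumptions made for this sketch, and
     the code does not call the survey package.

```r
# Half-sample indicators: rows are PSUs, columns are replicates.
# A 1 means the PSU falls in the retained half-sample for that replicate.
half <- cbind(c(1,0,1,0,1,0), c(1,0,0,1,0,1),
              c(0,1,1,0,0,1), c(0,1,0,1,1,0))

# BRR: retained PSUs get factor 2, the others factor 0.
brr_wts <- 2 * half

# Fay: factor 2 - rho in one half-sample and rho in the other.
rho <- 0.5                       # illustrative value only
fay_wts <- rho + 2 * (1 - rho) * half

# Setting rho = 0 reproduces the BRR factors.
fay_rho0 <- 0 + 2 * (1 - 0) * half
```

     With 'rho' strictly between 0 and 1 no observation is dropped
     entirely from any replicate.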

     The JK1 and JKn types are both jackknife estimators deleting one
     cluster at a time. JKn is designed for stratified and JK1 for
     unstratified designs.

     Averaged bootstrap weights ("mean bootstrap") are used for some
     surveys from Statistics Canada. Yee et al (1999) describe their
     construction and use for one such survey.

     The variance is computed as the sum of squared deviations of the
     replicates from their mean.  This may be rescaled: 'scale' is an
     overall multiplier and 'rscales' is a vector of replicate-specific
     multipliers for the squared deviations.  If the replication
     weights incorporate the sampling weights ('combined.weights=TRUE')
     or for 'type="other"', these must be specified; otherwise they can
     be guessed from the weights.
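
     The rescaled sum of squares described above can be sketched in
     base R as follows.  'rep_variance' is a hypothetical helper written
     for this illustration, not a function in the survey package, and
     the scale '(n - 1) / n' is the usual delete-one jackknife constant,
     assumed here for an unstratified sample.

```r
# Hypothetical helper: scale * sum(rscales * squared deviations from the mean).
rep_variance <- function(theta, scale = 1, rscales = rep(1, length(theta))) {
  scale * sum(rscales * (theta - mean(theta))^2)
}

x <- c(1, 2, 3, 4, 5)
n <- length(x)

# Delete-one jackknife replicates of the sample mean.
theta <- sapply(seq_len(n), function(i) mean(x[-i]))

# Overall multiplier (n - 1) / n, all replicate-specific multipliers equal to 1.
v <- rep_variance(theta, scale = (n - 1) / n)

# For the sample mean this agrees with the textbook value var(x) / n.
all.equal(v, var(x) / n)
```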

     A finite population correction may be specified for
     'type="other"', 'type="JK1"' and 'type="JKn"'.  'fpc' must be a
     vector with one entry for each replicate. To specify sampling
     fractions use 'fpctype="fraction"', and to specify the correction
     directly use 'fpctype="correction"'.
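
     The arithmetic linking a sampling fraction to a correction can be
     sketched as below.  The numbers 15 and 757 come from the 'scale'
     argument used in the Examples section; reading them as sampled and
     population cluster counts, and taking the correction to be one
     minus the sampling fraction, are assumptions of this sketch.

```r
n_sampled <- 15                   # assumed number of sampled clusters
n_pop     <- 757                  # assumed number of population clusters

f          <- n_sampled / n_pop   # sampling fraction
correction <- 1 - f               # assumed form of the correction

# Usual JK1 scale for n_sampled replicates, multiplied by the correction.
jk1_scale      <- (n_sampled - 1) / n_sampled
combined_scale <- correction * jk1_scale

# Same value as the expression (1 - 15/757) * 14/15 in the Examples.
all.equal(combined_scale, (1 - 15/757) * 14/15)
```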

     'repweights' may be a character string giving a regular expression
     for the replicate weight variables. For example, in the California
     Health Interview Survey public-use data, the sampling weights are
     '"rakedw0"' and the replicate weights are '"rakedw1"' to
     '"rakedw80"'.  The regular expression '"rakedw[1-9]"' matches the
     replicate weight variables (and not the sampling weight variable).
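
     The behaviour of this regular expression can be checked in base R
     with 'grep'.  The variable names below are generated for
     illustration rather than read from the survey data, and the
     assumption is that the names are matched as with 'grep', so that a
     match anywhere in the name counts.

```r
# Illustrative column names: an identifier, the sampling weight 'rakedw0'
# and the replicate weights 'rakedw1' to 'rakedw80'.
vars <- c("id", paste0("rakedw", 0:80))

rep_cols <- grep("rakedw[1-9]", vars, value = TRUE)

length(rep_cols)          # 80
"rakedw0" %in% rep_cols   # FALSE
"rakedw10" %in% rep_cols  # TRUE: the first digit after 'rakedw' is 1
```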

     'data' may be a character string giving the name of a table or
     view in a relational database that can be accessed through the
     'DBI' or 'ODBC' interfaces. For DBI interfaces 'dbtype' should be
     the name of the database driver and 'dbname' should be the name by
     which the driver identifies the specific database (e.g., file name
     for SQLite). For ODBC databases 'dbtype' should be '"ODBC"' and
     'dbname' should be the registered DSN for the database. On the
     Windows GUI, 'dbname=""' will produce a dialog box for interactive
     selection. 

     The appropriate database interface package must already be loaded
     (e.g., 'RSQLite' for SQLite, 'RODBC' for ODBC).  The survey design
     object will contain the replicate weights, but actual variables
     will be loaded from the database only as needed.  Use 'close' to
     close the database connection and 'open' to reopen the connection,
     e.g., after loading a saved object.

     The database interface does not attempt to modify the underlying
     database and so can be used with read-only permissions on the
     database.

     To generate your own replicate weights, either use 'as.svrepdesign'
     on a 'survey.design' object, or see 'brrweights', 'bootweights',
     'jk1weights' and 'jknweights'.
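
     A minimal sketch of the first route, assuming the survey package
     and its 'api' example data are installed: a design specified by
     clusters and sampling weights is converted to a replicate-weight
     design.  The object names 'dclus1' and 'rclus1' are chosen for this
     illustration, and the replicate type is left to the default of
     'as.svrepdesign'.

```r
if (requireNamespace("survey", quietly = TRUE)) {
  library(survey)
  data(api)   # example datasets shipped with the survey package

  # One-stage cluster sample described by its sampling design.
  dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

  # Convert to a design with replicate weights.
  rclus1 <- as.svrepdesign(dclus1)

  print(svymean(~api00, rclus1))
}
```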

     The 'model.frame' method extracts the observed data.

_V_a_l_u_e:

     Object of class 'svyrep.design', with methods for 'print',
     'summary', 'weights', 'image'.

_N_o_t_e:

     To use replication-weight analyses on a survey specified by
     sampling design, use 'as.svrepdesign' to convert it.

_R_e_f_e_r_e_n_c_e_s:

     Levy and Lemeshow. "Sampling of Populations". Wiley.

     Shao and Tu. "The Jackknife and Bootstrap." Springer.

     Yee et al (1999). Bootstrap Variance Estimation for the National
     Population Health Survey. Proceedings of the ASA Survey Research
     Methodology Section. <URL: 
     http://www.amstat.org/Sections/Srms/Proceedings/papers/1999_136.pdf>

_S_e_e _A_l_s_o:

     'as.svrepdesign', 'svydesign', 'brrweights', 'bootweights'

_E_x_a_m_p_l_e_s:

     data(scd)
     # use BRR replicate weights from Levy and Lemeshow
     repweights <- 2 * cbind(c(1,0,1,0,1,0), c(1,0,0,1,0,1),
                             c(0,1,1,0,0,1), c(0,1,0,1,1,0))
     scdrep <- svrepdesign(data = scd, type = "BRR", repweights = repweights,
                           combined.weights = FALSE)
     svyratio(~alive, ~arrests, scdrep)

     ## Not run: 
     ## Needs RSQLite
     library(RSQLite)
     db_rclus1 <- svrepdesign(weights = ~pw, repweights = "wt[1-9]+",
                              type = "JK1", scale = (1 - 15/757) * 14/15,
                              data = "apiclus1rep", dbtype = "SQLite",
                              dbname = system.file("api.db", package = "survey"))
     svymean(~api00 + api99, db_rclus1)

     summary(db_rclus1)

     ## closing and re-opening a connection
     close(db_rclus1)
     db_rclus1
     try(svymean(~api00+api99,db_rclus1))
     db_rclus1<-open(db_rclus1)
     svymean(~api00+api99,db_rclus1)


     ## End(Not run)

