rake                 package:survey                 R Documentation

_R_a_k_i_n_g _o_f _r_e_p_l_i_c_a_t_e _w_e_i_g_h_t _d_e_s_i_g_n

_D_e_s_c_r_i_p_t_i_o_n:

     Raking uses iterative post-stratification to match marginal
     distributions of a survey sample to known population margins.

_U_s_a_g_e:

     rake(design, sample.margins, population.margins,
          control = list(maxit = 10, epsilon = 1, verbose = FALSE),
          compress = NULL)

_A_r_g_u_m_e_n_t_s:

  design: A survey object 

sample.margins: list of formulas or data frames describing sample
          margins

population.margins: list of tables or data frames describing
          corresponding population margins 

 control: 'maxit' sets the maximum number of iterations. Convergence is
          declared if the maximum change in a table entry is less than
          'epsilon'. If 'epsilon<1' it is taken to be a fraction of the
          total sampling weight.

compress: If 'design' has replicate weights, should the new replicate
          weight matrix be compressed? When 'NULL', compression is
          attempted only if the original weight matrix was compressed.
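
     The rule for 'epsilon' can be sketched in a few lines of base R.
     This is only an illustration of the rule as documented above, not
     the package's own code: the helper name 'effective_epsilon' and the
     total weight of 6194 are assumptions made for the example.

```r
# Hypothetical helper (not part of the survey package): turn the
# user-supplied 'epsilon' into an absolute tolerance on table entries.
effective_epsilon <- function(epsilon, total_weight) {
  # Values below 1 are read as a fraction of the total sampling weight;
  # values of 1 or more are used as given.
  if (epsilon < 1) epsilon * total_weight else epsilon
}

effective_epsilon(1, 6194)      # default: absolute tolerance of 1
effective_epsilon(0.001, 6194)  # 0.1% of an assumed total weight of 6194
```

     Under this reading, a fractional 'epsilon' keeps the stopping rule
     comparable across designs whose weights sum to very different
     totals.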

_D_e_t_a_i_l_s:

     The 'sample.margins' should be in a format suitable for
     'postStratify'.

     Raking (also known as iterative proportional fitting) is known to
     converge for any table without zeros, and for any table with zeros
     for which there is a joint distribution with the given margins and
     the same pattern of zeros.  The 'margins' need not be
     one-dimensional.
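
     As a minimal sketch of iterative proportional fitting on a table
     without zeros, the following base R code alternately rescales the
     rows and columns of a two-way table until a full pass changes no
     cell by 'epsilon' or more. The function 'ipf_sketch', the starting
     counts, and the target margins are all made up for illustration;
     'rake' itself operates on survey designs rather than on bare
     tables.

```r
# Hypothetical helper (not part of the survey package): rake a two-way
# table of weighted counts so its row and column sums match the targets.
ipf_sketch <- function(tab, row_target, col_target,
                       maxit = 100, epsilon = 1e-8) {
  for (iter in seq_len(maxit)) {
    old <- tab
    # Scale each row so that it sums to its target total.
    tab <- sweep(tab, 1, row_target / rowSums(tab), "*")
    # Scale each column so that it sums to its target total.
    tab <- sweep(tab, 2, col_target / colSums(tab), "*")
    # Stop once a full pass changes every cell by less than epsilon.
    if (max(abs(tab - old)) < epsilon) break
  }
  tab
}

# Made-up weighted counts with no zero cells, and made-up targets that
# share the same grand total (100).
start <- matrix(c(30, 20, 10, 40), nrow = 2)
fit <- ipf_sketch(start, row_target = c(60, 40), col_target = c(45, 55))

rowSums(fit)  # close to the row targets
colSums(fit)  # equal to the column targets (the last margin adjusted)
```

     The margin adjusted last is matched exactly, while the other is
     matched only up to the tolerance, which is why a convergence
     criterion is needed at all.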

     The algorithm works by repeated calls to 'postStratify' (iterative
     proportional fitting), which is efficient for large multiway
     tables. For small tables 'calibrate' will be faster, and also
     allows raking to population totals for continuous variables, and
     raking with bounded weights.
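
     The idea of repeated calls to 'postStratify' can be sketched by
     hand. The loop below is an illustration only, assuming the 'api'
     data shipped with the package and a fixed number of passes (20,
     chosen arbitrarily) in place of a convergence test. It shows the
     weight adjustment, not how 'rake' sets up variance estimation, so
     the result should not be used in place of 'rake'.

```r
library(survey)
data(api)

dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Known population totals for the two margins.
pop.types   <- data.frame(stype = c("E", "H", "M"), Freq = c(4421, 755, 1018))
pop.schwide <- data.frame(sch.wide = c("No", "Yes"), Freq = c(1072, 5122))

# Alternate post-stratification on each margin for a fixed number of passes.
d <- dclus1
for (pass in 1:20) {
  d <- postStratify(d, ~stype, pop.types)
  d <- postStratify(d, ~sch.wide, pop.schwide)
}

# Both sets of weighted margins should now be close to the population totals.
svytable(~stype, d)
svytable(~sch.wide, d)
```

     In practice 'rake' wraps this kind of loop together with the
     convergence check controlled by 'maxit' and 'epsilon'.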

_V_a_l_u_e:

     A raked survey design.

_S_e_e _A_l_s_o:

     'postStratify', 'compressWeights'

     'calibrate' for other ways to use auxiliary information.

_E_x_a_m_p_l_e_s:

     data(api)
     dclus1 <- svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
     rclus1 <- as.svrepdesign(dclus1)

     svymean(~api00, rclus1)
     svytotal(~enroll, rclus1)

     ## known population totals for each raking margin
     pop.types <- data.frame(stype=c("E","H","M"), Freq=c(4421,755,1018))
     pop.schwide <- data.frame(sch.wide=c("No","Yes"), Freq=c(1072,5122))

     rclus1r <- rake(rclus1, list(~stype,~sch.wide), list(pop.types, pop.schwide))

     svymean(~api00, rclus1r)
     svytotal(~enroll, rclus1r)

     ## marginal totals correspond to population
     xtabs(~stype, apipop)
     svytable(~stype, rclus1r, round=TRUE)
     xtabs(~sch.wide, apipop)
     svytable(~sch.wide, rclus1r, round=TRUE)

     ## joint totals don't correspond 
     xtabs(~stype+sch.wide, apipop)
     svytable(~stype+sch.wide, rclus1r, round=TRUE)

     ## Do it for a design without replicate weights
     dclus1r <- rake(dclus1, list(~stype,~sch.wide), list(pop.types, pop.schwide))

     svymean(~api00, dclus1r)
     svytotal(~enroll, dclus1r)

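     ## compare to raking with calibrate()
     ## pop gives totals for the model-matrix columns of ~stype+sch.wide:
     ## (Intercept), stypeH, stypeM, sch.wideYes
     dclus1gr <- calibrate(dclus1, ~stype+sch.wide, pop=c(6194, 755, 1018, 5122),
                           calfun="raking")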
     svymean(~stype+api00, dclus1r)
     svymean(~stype+api00, dclus1gr)

     ## compare to joint post-stratification
     ## (only possible if joint population table is known)
     ##
     pop.table <- xtabs(~stype+sch.wide,apipop)
     rclus1ps <- postStratify(rclus1, ~stype+sch.wide, pop.table)
     svytable(~stype+sch.wide, rclus1ps, round=TRUE)

     svymean(~api00, rclus1ps)
     svytotal(~enroll, rclus1ps)

     ## Example of raking with partial joint distributions
     pop.imp <- data.frame(comp.imp=c("No","Yes"), Freq=c(1712,4482))
     dclus1r2 <- rake(dclus1, list(~stype+sch.wide, ~comp.imp),
                      list(pop.table, pop.imp))
     svymean(~api00, dclus1r2)

