svyglm                package:survey                R Documentation

_S_u_r_v_e_y-_w_e_i_g_h_t_e_d _g_e_n_e_r_a_l_i_s_e_d _l_i_n_e_a_r _m_o_d_e_l_s.

_D_e_s_c_r_i_p_t_i_o_n:

     Fit a generalised linear model to data from a complex survey
     design, with inverse-probability weighting and design-based
     standard errors.

_U_s_a_g_e:

     ## S3 method for class 'survey.design':
     svyglm(formula, design, subset=NULL, ...)
     ## S3 method for class 'svyrep.design':
     svyglm(formula, design, subset=NULL, ..., rho=NULL,
     return.replicates=FALSE, na.action,multicore=getOption("survey.multicore"))
     ## S3 method for class 'svyglm':
     summary(object, correlation = FALSE, df.resid=NULL,
     ...)
     ## S3 method for class 'svyglm':
     predict(object,newdata=NULL,total=NULL,
                              type=c("link","response","terms"),
                              se.fit=(type != "terms"),vcov=FALSE,...)

_A_r_g_u_m_e_n_t_s:

 formula: Model formula

  design: Survey design from 'svydesign' or 'svrepdesign'. Must contain
          all variables in the formula

  subset: Expression to select a subpopulation

     ...: Other arguments passed to 'glm' or 'summary.glm' 

     rho: For replicate BRR designs, to specify the parameter for Fay's
          variance method, giving weights of 'rho' and '2-rho'

return.replicates: Return the replicates as a component of the result?

  object: A 'svyglm' object

correlation: Include the correlation matrix of parameters?

na.action: Handling of NAs

multicore: Use the 'multicore' package to distribute replicates across
          processors?

df.resid: Optional denominator degrees of freedom for Wald tests

 newdata: new data frame for prediction

   total: population size when predicting population total

    type: linear predictor ('link') or response

  se.fit: if 'TRUE', return variances of predictions

    vcov: if 'TRUE' and 'se=TRUE' return full variance-covariance
          matrix of predictions

_D_e_t_a_i_l_s:

     There is no 'anova' method for 'svyglm' as the models are not
     fitted by maximum likelihood. The function 'regTermTest' may be
     useful for testing sets of regression terms.

     For binomial and Poisson families use 'family=quasibinomial()' and
     'family=quasipoisson()' to avoid a warning about non-integer
     numbers of successes.  The `quasi' versions of the family objects
     give the same point estimates and standard errors and do not give
     the warning.

     If 'df.resid' is not specified the df for the null model is
     computed by 'degf' and the residual df computed by subtraction. 
     It's not that these are particularly good approximations in a
     regression model but they are relatively standard. To get tests
     based on a Normal distribution use 'df.resid=Inf'.

     Parallel processing with 'multicore=TRUE' is helpful only for
     fairly large data sets and on computers with sufficient memory. It
     may be incompatible with GUIs, although the Mac Aqua GUI appears
     to be safe.

     'predict' gives fitted values and sampling variability for
     specific new values of covariates.  When 'newdata' are the
     population mean it gives the regression estimator of the mean, and
     when 'newdata' are the population totals and 'total' is specified
     it gives the regression estimator of the population total. 
     Regression estimators of mean and total can also be obtained with
     'calibrate'.

_V_a_l_u_e:

     'svyglm' returns an object of class 'svyglm'.  The 'predict'
     method returns an object of class 'svystat'

_A_u_t_h_o_r(_s):

     Thomas Lumley

_S_e_e _A_l_s_o:

     'glm', which is used to do most of the work.

     'regTermTest', for multiparameter tests

     'calibrate', for an alternative way to specify regression
     estimators of population totals or means

     'svyttest' for one-sample and two-sample t-tests.

_E_x_a_m_p_l_e_s:

       data(api)

       dstrat<-svydesign(id=~1,strata=~stype, weights=~pw, data=apistrat, fpc=~fpc)
       dclus2<-svydesign(id=~dnum+snum, weights=~pw, data=apiclus2)
       rstrat<-as.svrepdesign(dstrat)
       rclus2<-as.svrepdesign(dclus2)

       summary(svyglm(api00~ell+meals+mobility, design=dstrat))
       summary(svyglm(api00~ell+meals+mobility, design=dclus2))
       summary(svyglm(api00~ell+meals+mobility, design=rstrat))
       summary(svyglm(api00~ell+meals+mobility, design=rclus2))

       ## use quasibinomial, quasipoisson to avoid warning messages
       summary(svyglm(sch.wide~ell+meals+mobility, design=dstrat,
             family=quasibinomial()))

       ## Compare regression and ratio estimation of totals
       api.ratio <- svyratio(~api.stu,~enroll, design=dstrat)
       pop<-data.frame(enroll=sum(apipop$enroll, na.rm=TRUE))
       npop <- nrow(apipop)
       predict(api.ratio, pop$enroll)

       ## regression estimator is less efficient
       api.reg <- svyglm(api.stu~enroll, design=dstrat)
       predict(api.reg, newdata=pop, total=npop)
       ## same as calibration estimator
       svytotal(~api.stu, calibrate(dstrat, ~enroll, pop=c(npop, pop$enroll)))

       ## svyglm can also reproduce the ratio estimator
       api.reg2 <- svyglm(api.stu~enroll-1, design=dstrat,
                         family=quasi(link="identity",var="mu"))
       predict(api.reg2, newdata=pop, total=npop)

       ## higher efficiency by modelling variance better
       api.reg3 <- svyglm(api.stu~enroll-1, design=dstrat,
                         family=quasi(link="identity",var="mu^3"))
       predict(api.reg3, newdata=pop, total=npop)
       ## true value
       sum(apipop$api.stu)

      

