svyquantile              package:survey              R Documentation

_Q_u_a_n_t_i_l_e_s _f_o_r _s_a_m_p_l_e _s_u_r_v_e_y_s

_D_e_s_c_r_i_p_t_i_o_n:

     Compute quantiles for data from complex surveys.

_U_s_a_g_e:

     ## S3 method for class 'survey.design':
     svyquantile(x, design, quantiles, alpha=0.05,
        ci=FALSE, method = "linear", f = 1,
        interval.type=c("Wald","score","betaWald"), na.rm=FALSE,se=ci,
        ties=c("discrete","rounded"), df=Inf,...)
     ## S3 method for class 'svyrep.design':
     svyquantile(x, design, quantiles,
        method ="linear", interval.type=c("probability","quantile"), f = 1,
        return.replicates=FALSE, ties=c("discrete","rounded"),...)
     ## S3 method for class 'svyquantile':
     SE(object,...)

_A_r_g_u_m_e_n_t_s:

       x: A formula, vector or matrix

  design: 'survey.design' or 'svyrep.design' object

quantiles: Quantiles to estimate

  method: see 'approxfun'

       f: see 'approxfun'

      ci: Compute a confidence interval (relatively slow)?

      se: Compute standard errors from the confidence interval length?

   alpha: Level for confidence interval

interval.type: See Details below

    ties: See Details below

      df: Degrees of freedom for a t-distribution. 'Inf' requests a
          Normal distribution, 'NULL' uses 'degf'. Not relevant for
          'interval.type="betaWald"'

return.replicates: Return the replicate estimates of the quantiles?

   na.rm: Remove 'NA's?

     ...: arguments for future expansion

  object: Object returned by 'svyquantile.survey.design'

_D_e_t_a_i_l_s:

     The definition of the CDF and thus of the quantiles is ambiguous
     in the presence of ties.  With 'ties="discrete"' the data are
     treated as genuinely discrete, so the CDF has vertical steps at
     tied observations. With 'ties="rounded"' all the weights for tied
     observations are summed and the CDF interpolates linearly between
     distinct observed values, and so is a continuous function. 
     Combining 'interval.type="betaWald"' and 'ties="discrete"' is
     (close to) the proposal of Shah and Vaish (2006) used in some
     versions of SUDAAN.
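
     The two treatments of ties can be illustrated with a minimal
     base-R sketch. This is not the package's internal code: the
     data, the weights and the helper names ('ux', 'cum',
     'q_discrete', 'q_rounded') are hypothetical and chosen only for
     illustration.

```r
x <- c(1, 2, 2, 3, 5)          # hypothetical observations, with a tie at 2
w <- c(1, 1, 2, 1, 1)          # hypothetical sampling weights
p <- 0.5                       # target probability (the median)

# Sum the weights of tied observations and accumulate them into a CDF
ux  <- sort(unique(x))
cum <- cumsum(as.vector(tapply(w, x, sum))) / sum(w)

# Discrete ties: step-function CDF, so the quantile is the smallest
# distinct value whose CDF is at least p
q_discrete <- ux[which(cum >= p)[1]]

# Rounded ties: the CDF is interpolated linearly between distinct
# values, and the quantile is read off the inverted interpolation
q_rounded <- approx(cum, ux, xout = p, rule = 2)$y
```

     With these inputs the discrete median falls on the tied value 2,
     while the interpolated median lies between 1 and 2.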

     Interval estimation for quantiles is complicated, because the
     influence function is not continuous.  Linearisation cannot be
     used directly, and computing the variance of replicates is valid
     only for some designs (e.g. BRR, but not jackknife). The
     'interval.type' option controls how the intervals are computed.

     For 'survey.design' objects the default is 'interval.type="Wald"'.
     A Wald confidence interval at level '1-alpha' (95% by default) is
     constructed for the proportion below the estimated quantile. The
     inverse of the estimated CDF is then used to map this interval to
     a confidence interval for the quantile. This is the method of
     Woodruff (1952). For '"betaWald"' the same
     procedure is used, but the confidence interval for the proportion
     is computed using the exact binomial cdf with an effective sample
     size proposed by Korn & Graubard (1998).
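
     The two steps of the Woodruff construction can be sketched in
     base R. The sketch assumes a simple random sample with equal
     weights, so the standard error of the proportion uses the
     binomial formula; for a real complex design that standard error
     would come from the design instead. The data and variable names
     are hypothetical.

```r
# ASSUMPTION: simple random sample, equal weights (not a complex design)
set.seed(1)
x <- rexp(200)                 # hypothetical data
p <- 0.5
alpha <- 0.05
n <- length(x)

# Point estimate: inverse of the empirical CDF at p
q_hat <- quantile(x, p, type = 1, names = FALSE)

# Step 1: Wald interval for the proportion below the estimated quantile
se_p <- sqrt(p * (1 - p) / n)
p_ci <- p + c(-1, 1) * qnorm(1 - alpha / 2) * se_p

# Step 2: map the interval back through the inverse CDF
q_ci <- quantile(x, p_ci, type = 1, names = FALSE)
```

     Because the inverse CDF is monotone, the resulting interval
     always contains the point estimate, but it need not be symmetric
     around it.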

     If 'interval.type="score"' we use a method described by Binder
     (1991) and due originally to Francisco and Fuller (1986), which
     corresponds to inverting a robust score test.  At the upper and
     lower limits of the confidence interval, a test of the null
     hypothesis that the cumulative distribution function is equal to
     the target quantile just rejects.  This was the default before
     version 2.9. It is much slower than '"Wald"', and Dorfman &
     Valliant (1993) suggest it is not any more accurate.

     Standard errors are computed from these confidence intervals by
     dividing the confidence interval length by '2*qnorm(1-alpha/2)'.
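
     As a minimal sketch of this conversion, the following base-R
     code recovers a standard error from a Normal-based interval. The
     interval is hypothetical, and the positive Normal quantile
     'qnorm(1 - alpha/2)' is used so that the result is positive.

```r
alpha <- 0.05
z <- qnorm(1 - alpha / 2)      # positive Normal quantile

# Hypothetical interval: estimate 10 with standard error 2
ci <- 10 + c(-1, 1) * z * 2

# Interval length divided by twice the Normal quantile
se <- diff(ci) / (2 * z)
```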

     For replicate-weight designs, ordinary replication-based standard
     errors are valid for BRR and Fay's method, and for some
     bootstrap-based designs, but not for jackknife-based designs.
     'interval.type="quantile"' gives these replication-based standard
     errors.  The default, 'interval.type="probability"', computes
     confidence intervals on the probability scale and then transforms
     them back to quantiles, the equivalent of 'interval.type="Wald"' for
     'survey.design' objects (with 'alpha=0.05').
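
     The idea behind a replication-based standard error for a
     quantile can be sketched in base R with balanced half-samples.
     This is an illustration only: the data are hypothetical, the
     helper 'wmedian' is not part of the package, and the variance is
     centred on the full-sample estimate, which may differ from the
     package's own convention.

```r
# Hypothetical data: 3 strata with 2 PSUs each, one observation per PSU
y <- c(12, 15, 9, 20, 14, 11)
w <- rep(1, 6)

# Balanced half-sample replicate weights: each column keeps one PSU
# per stratum and doubles its weight
repw <- 2 * cbind(c(1, 0, 1, 0, 1, 0), c(1, 0, 0, 1, 0, 1),
                  c(0, 1, 1, 0, 0, 1), c(0, 1, 0, 1, 1, 0))

# Weighted median from a step-function CDF (hypothetical helper)
wmedian <- function(y, w) {
  o <- order(y)
  cum <- cumsum(w[o]) / sum(w)
  y[o][which(cum >= 0.5)[1]]
}

theta   <- wmedian(y, w)                                   # full sample
theta_r <- apply(repw, 2, function(r) wmedian(y, w * r))   # replicates

# BRR-style standard error: root mean squared deviation of the replicates
se_brr <- sqrt(mean((theta_r - theta)^2))
```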

     There is a 'confint' method for 'svyquantile' objects; it simply
     extracts the pre-computed confidence interval.

_V_a_l_u_e:

     Returns a list whose first component is the quantiles and whose
     second component is the confidence intervals. For replicate-weight
     designs, returns an object of class 'svyrepstat'.

_A_u_t_h_o_r(_s):

     Thomas Lumley

_R_e_f_e_r_e_n_c_e_s:

     Binder DA (1991) Use of estimating functions for interval
     estimation from complex surveys. _Proceedings of the ASA Survey
     Research Methods Section_  1991: 34-42

     Dorfman A, Valliant R (1993) Quantile variance estimators in
     complex surveys. Proceedings of the ASA Survey Research Methods
     Section. 1993: 866-871

     Korn EL, Graubard BI (1998) Confidence Intervals For Proportions
     With Small Expected Number of Positive Counts Estimated From
     Survey Data. Survey Methodology 24:193-201.

     Francisco CA, Fuller WA (1986) Estimation of the distribution
     function with a complex survey. Technical Report, Iowa State
     University.

     Shao J, Tu D (1995) _The Jackknife and Bootstrap_. Springer.

     Shah BV, Vaish AK (2006) Confidence Intervals for Quantile
     Estimation from Complex Survey Data. Proceedings of the Section on
     Survey Research Methods. 

     Woodruff RS (1952) Confidence intervals for medians and other
     position measures. JASA 47, 635-646.

_S_e_e _A_l_s_o:

     'svykm' for quantiles of survival curves

     'svyciprop' for confidence intervals on proportions.

_E_x_a_m_p_l_e_s:

       data(api)
       ## population
       quantile(apipop$api00,c(.25,.5,.75))

       ## one-stage cluster sample
       dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
       svyquantile(~api00, dclus1, c(.25,.5,.75),ci=TRUE)
       svyquantile(~api00, dclus1, c(.25,.5,.75),ci=TRUE,interval.type="betaWald")
       svyquantile(~api00, dclus1, c(.25,.5,.75),ci=TRUE,df=NULL)

       dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)
       (qapi<-svyquantile(~api00, dclus1, c(.25,.5,.75),ci=TRUE, interval.type="score"))
       SE(qapi)

       #stratified sample
       dstrat<-svydesign(id=~1, strata=~stype, weights=~pw, data=apistrat, fpc=~fpc)
       svyquantile(~api00, dstrat, c(.25,.5,.75),ci=TRUE)

       #stratified sample, replicate weights
       # interval.type="probability" is necessary for jackknife weights
       rstrat<-as.svrepdesign(dstrat)
       svyquantile(~api00, rstrat, c(.25,.5,.75), interval.type="probability")

       # BRR method
       data(scd)
       repweights<-2*cbind(c(1,0,1,0,1,0), c(1,0,0,1,0,1), c(0,1,1,0,0,1),
                   c(0,1,0,1,1,0))
       scdrep<-svrepdesign(data=scd, type="BRR", repweights=repweights)
       svyquantile(~arrests+alive, design=scdrep, quantiles=0.5, interval.type="quantile")

      

