reldist               package:reldist               R Documentation

_I_n_f_e_r_e_n_c_e _f_o_r _R_e_l_a_t_i_v_e _D_i_s_t_r_i_b_u_t_i_o_n_s

_D_e_s_c_r_i_p_t_i_o_n:

     Estimate and graph relative distribution and density functions for
     continuous or discrete data.

_U_s_a_g_e:

     reldist(y, yo=FALSE, ywgt=FALSE,yowgt=FALSE,
       show="none", decomp="locadd",
       location="median", scale="IQR",
       rpmult=FALSE, 
       z=FALSE, zo=FALSE,
       smooth = 0.35, 
       quiet = TRUE, 
       cdfplot=FALSE,
       ci=FALSE,
       bar="no",
       add=FALSE,
       graph=TRUE, type="l",
       xlab="Reference proportion",ylab="Relative Density",yaxs="r",
       yolabs=pretty(yo), yolabslabs=NULL, 
       ylabs=pretty(y), ylabslabs=NULL,
       yolabsloc=0.6, ylabsloc=1, 
       ylim=NULL, cex=0.8, lty=1,
       binn=100,
       aicc=seq(0.0001, 5, length=30),
       deciles=(0:10)/10,
       discrete=FALSE,
       method="gam",
       ...)

_A_r_g_u_m_e_n_t_s:

       y: Sample from comparison distribution.

      yo: Sample from reference distribution.

discrete: Do 'y' and 'yo' come from a discrete distribution? If 'TRUE',
          a discrete estimator is used instead of the default
          continuous one.

  smooth: Degree of smoothness required in the fit. Higher values lead
          to smoother curves; lower positive values lead to closer fits
          to the observed data. If it is not specified, the value that
          minimizes GCV is used. If a value less than zero is
          specified, the value is chosen to minimize a corrected AIC.
          If 'discrete=TRUE', it is the minimum number of values to
          pool in the reference distribution in the probability mass
          function estimate.

  method: Method used to estimate the relative density. The default
          ('"gam"') uses a local likelihood approach based on smoothed
          Poisson  regression. The option '"loclik"' uses log-splines.
          The option '"quick"' uses the Anscombe transformation to
          stabilize variances. In versions prior to 1.3 the '"quick"'
          approach was used.

   graph: Graph the results on the current device.

     bar: Graph the deciles on the current device.  Possible values of
          'bar' are '"no"' (no deciles plotted), '"yes"' (deciles
          plotted with the non-parametric fit), and '"only"' (deciles
          plotted without the non-parametric fit).

     add: Add the density to the current plot?

    ylim: Plotting limits for the vertical axis.

     lty: Line type to be used for the density.

    xlab: Horizontal label.

    ylab: Vertical label.

   ylabs: Locations for labels to be added to the right axis.

ylabslabs: Labels indicating the original scale for the comparison
          distribution.

ylabsloc: Distance of labels to right of axis (in lines).

  yolabs: Locations for labels to be added to the top axis.

yolabslabs: Labels indicating the original scale for the reference
          distribution.

yolabsloc: Distance of labels above axis (in lines).

    yaxs: Style of vertical axis.

 cdfplot: Calculate and plot the CDF rather than the density?

   quiet: Should the output be returned invisibly?

      ci: Plot (pointwise) 95% confidence intervals?

    ywgt: Weights on the comparison sample.

   yowgt: Weights on the reference sample.

       z: Covariate on the comparison sample to be used to adjust it to
          the reference distribution. Only used if the form of matching
          is specified as 'decomp="covariate"'.

      zo: Covariate on the reference sample to be used in the
          adjustment to the reference distribution. Only used if the
          form of matching is specified as 'decomp="covariate"'.

    show: Type of relative distribution to produce.  Possible values
          are '"none"' (comparison to reference), '"residual"'
          (location-matched reference to reference), and '"effect"'
          (comparison to location-matched reference).

  decomp: Form of matching to the comparison sample.  Possible values
          are '"locmult"' (multiplicatively scale the reference),
          '"locadd"' (additively shift the reference), '"lsadd"'
          (location/scale additive shift), and '"covariate"'
          (covariate-adjust the reference; requires 'z' and 'zo' to be
          specified).

location: How to measure location.  Possible values are '"mean"' and
          '"median"'.

   scale: How to measure the scale.  Possible values are '"standev"'
          (standard deviation) and '"IQR"' (interquartile range).

  rpmult: Only in calculation of polarization indices: multiplicatively
          scale the reference sample to the comparison sample before
          comparing the two distributions?

    binn: Number of bins used in the smoother.

 deciles: The percentiles used for the histogram bins. Typically
          deciles (i.e., 0.0, 0.1, 0.2,...,0.9, 1.0), but any set can
          be used (e.g., quintiles, terciles).

    aicc: Values of the smoothing parameter to search over in
          minimizing the corrected AIC. Only used if  'method="gam"'
          and 'smooth' is less than 0.

    type: Type of plot to use. See 'par()'.

     cex: Character expansion to use in plots. See 'par()'.

     ...: Additional arguments to the plot functions. See 'par()'.
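
     The matching options described under 'decomp', 'location' and
     'scale' can be sketched in base R. This is only an illustration on
     simulated data, under the assumption that matching means
     transforming the reference sample so that its location (and, for
     '"lsadd"', its scale) agrees with the comparison sample. The
     variable names are hypothetical and the code is not taken from the
     package internals.

```r
# Illustrative sketch only: simulated data, not the package's internal code.
set.seed(1)
yo <- rnorm(500, mean = 0, sd = 1)       # reference sample (simulated)
y  <- rnorm(500, mean = 0.5, sd = 1.5)   # comparison sample (simulated)

# Additive location shift (in the spirit of decomp="locadd" with
# location="median"): move the reference so its median equals that of y.
shift     <- median(y) - median(yo)
yo_locadd <- yo + shift

# Multiplicative scaling (in the spirit of decomp="locmult").
# Only sensible for positive data, so positive samples are built first.
yo_pos     <- exp(yo)
y_pos      <- exp(y)
yo_locmult <- yo_pos * median(y_pos) / median(yo_pos)

# Location/scale adjustment (in the spirit of decomp="lsadd" with
# scale="IQR"): match both the median and the interquartile range.
yo_lsadd <- (yo - median(yo)) * IQR(y) / IQR(yo) + median(y)
```

     After each transformation the adjusted reference has the same
     median as the comparison sample, so any remaining difference
     between the two is a difference in shape rather than location.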

_V_a_l_u_e:

       x: Horizontal coordinates for the density (typically
          percentages).

       y: Density at x.

      rp: 95% confidence interval for the median relative polarization
          as lower bound, estimate, upper bound.

     rpl: 95% confidence interval for the lower relative polarization
          as lower bound, estimate, upper bound.

     rpu: 95% confidence interval for the upper relative polarization
          as lower bound, estimate, upper bound.

     cdf: The x coordinates for the CDF (typically percentages) and the
          y values of the CDF at x.
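
     As a rough guide to what the 'rp' point estimate measures, the
     sketch below computes a median relative polarization index on
     simulated data. It assumes the usual definition from the relative
     distribution literature, 4 * mean(|r - 1/2|) - 1, applied to
     relative data formed against a median-matched reference. Weights
     and the confidence bounds returned by the function are not
     reproduced, and the variable names are hypothetical.

```r
# Illustrative sketch only: point estimate without weights or intervals.
set.seed(3)
yo <- rnorm(1000)            # reference sample (simulated)
y  <- rnorm(1000, sd = 2)    # comparison sample, more spread out

# Location-match the reference by an additive median shift, then express
# each comparison value as its proportion in the shifted reference.
yo_matched <- yo + (median(y) - median(yo))
r <- ecdf(yo_matched)(y)

# Median relative polarization: ranges from -1 (comparison concentrated
# at the centre of the reference) to 1 (comparison pushed to the tails).
mrp <- 4 * mean(abs(r - 0.5)) - 1
```

     Here the comparison sample is more dispersed than the reference,
     so the index is positive.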

_N_o_t_e:

     Most of the code is for the plotting and tinkering.  The guts of
     the method are forming the relative data at the top.  The rest is
     a standard fixed interval density estimation with a few bells and
     whistles.
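
     The two steps named above, forming the relative data and then
     estimating a density on a fixed interval, can be sketched in base
     R. This is a minimal illustration on simulated data. It uses the
     empirical CDF, a decile histogram and a plain kernel density
     estimate in place of the package's smoothers, and the variable
     names are hypothetical.

```r
# Illustrative sketch only: simulated data, not the package's internal code.
set.seed(2)
yo <- rnorm(1000)               # reference sample (simulated)
y  <- rnorm(1000, mean = 0.3)   # comparison sample (simulated)

# Relative data: each comparison value expressed as the proportion of the
# reference sample at or below it (empirical CDF of yo evaluated at y).
F0 <- ecdf(yo)
r  <- F0(y)                     # values lie in [0, 1]

# Crude relative density: histogram of r over decile bins.
# If y and yo had the same distribution, every height would be near 1.
deciles <- (0:10) / 10
h <- hist(r, breaks = deciles, plot = FALSE)
rel_density <- h$density

# Smooth estimate on the fixed interval [0, 1] using a kernel density.
# This stands in for, and is not the same as, method="gam".
d <- density(r, from = 0, to = 1)
```

     Bars of 'rel_density' above 1 mark regions of the reference
     distribution where the comparison sample is over-represented.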

_R_e_f_e_r_e_n_c_e_s:

     For more examples see the tech report by Handcock & Aldrich (2002)
     available at <URL: http://www.csss.washington.edu/Papers>

_E_x_a_m_p_l_e_s:

     #
     # First load the data.
     #

     data(nls, package="reldist")

     #
     # A simple example comparing permanent wages of the original to the
     # recent cohort in the NLS.  See H&M (1999) for details.

     reldist(y=recent$chpermwage,yo=original$chpermwage)

     #
     # A more sophisticated version of the same.
     #

     reldist(y=recent$chpermwage, yo=original$chpermwage,
             yowgt=original$wgt, ywgt=recent$wgt,      
             bar="yes",
             smooth=0.1,                              
             yolabs=seq(-1, 3, by=0.5),                 
             ylim=c(0, 3.0),cex=0.8,                   
             ylab="Relative Density",                 
             xlab="Proportion of the Original Cohort")

     #
     # A CDF version.
     #

     reldist(y=recent$chpermwage, yo=original$chpermwage,
         yowgt=original$wgt, ywgt=recent$wgt,      
         cdfplot=TRUE,                               
         smooth=0.4,                              
         yolabs=seq(-1,3,by=0.5),                 
         ylabs=seq(-1,3,by=0.5),                  
         cex=0.8,                                 
         ylab="proportion of the recent cohort",  
         xlab="proportion of the original cohort")

