Overview                package:Hmisc                R Documentation

_O_v_e_r_v_i_e_w _o_f _H_m_i_s_c _L_i_b_r_a_r_y

_D_e_s_c_r_i_p_t_i_o_n:

     The Hmisc library contains many functions useful for data
     analysis, high-level graphics, utility operations, computing
     sample size and power, translating SAS datasets into S, imputing
     missing values, advanced table making, variable clustering,
     character string manipulation, conversion of S objects to LaTeX
     code, recoding variables, and bootstrap repeated measures
     analysis.  Most of these functions were written by F Harrell, but
     a few were collected from statlib and from s-news; other authors
     are indicated below.  This collection of functions includes all of
     Harrell's submissions to statlib other than the functions in the
     Design and display libraries.  A few of the functions do not
     have "Help" documentation.

_F_u_n_c_t_i_o_n_s:


       *Function Name*      *Purpose*
       abs.error.pred       Computes various indexes of predictive accuracy based
                            on absolute errors, for linear models
       all.is.numeric       Check if character strings are legal numerics
       approxExtrap         Linear extrapolation
       aregImpute           Multiple imputation based on additive regression,
                            bootstrapping, and predictive mean matching
       areg.boot            Nonparametrically estimate transformations for both
                            sides of a multiple additive regression, and
                            bootstrap these estimates and R^2
       ballocation          Optimum sample allocations in 2-sample proportion test
       binconf              Exact confidence limits for a proportion and more accurate
                            (narrower!) score stat.-based Wilson interval
                            (Rollin Brant, mod. FEH)
       bootkm               Bootstrap Kaplan-Meier survival or quantile estimates
       bpower               Approximate power of 2-sided test for 2 proportions
                            Includes bpower.sim for exact power by simulation
       bpplot               Box-Percentile plot
                            (Jeffrey Banfield, umsfjban@bill.oscs.montana.edu)
       bsamsize             Sample size requirements for test of 2 proportions
       bystats              Statistics on a single variable by levels of >=1 factors
       bystats2             2-way statistics
       calltree             Calling tree of functions
                            (David Lubinsky, david@hoqax.att.com)
       character.table      Shows numeric equivalents of all Latin characters
                            Useful for putting many special chars. in graph titles
                            (Pierre Joyet, pierre.joyet@bluewin.ch)
       ciapower             Power of Cox interaction test
       cleanup.import       More compactly store variables in a data frame, and clean up
                            problem data when e.g. Excel spreadsheet had a non-
                            numeric value in a numeric column
       combine.levels       Combine infrequent levels of a categorical variable
       comment              Attach a comment attribute to an object:
                            comment(fit) <- 'Used old data'
                            comment(fit)    (prints comment)
       confbar              Draws confidence bars on an existing plot using multiple
                            confidence levels distinguished using color or gray scale
       contents             Print the contents (variables, labels, etc.) of a data frame
       cpower               Power of Cox 2-sample test allowing for noncompliance
       Cs                   Vector of character strings from list of unquoted names
       csv.get              Enhanced importing of comma-separated files (with labels)
       cut2                 Like cut with better endpoint label construction and allows
                            construction of quantile groups or groups with given n
       datadensity          Snapshot graph of distributions of all variables in
                            a data frame.  For continuous variables uses scat1d.
       dataRep              Quantify representation of new observations in a database
       ddmmmyy              SAS "date7" output format for a chron object
       deff                 Kish design effect and intra-cluster correlation
       describe             Function to describe different classes of objects.
                            Invoke by saying describe(object). It calls one of the
                            following:
       describe.data.frame  Describe all variables in a data frame (generalization
                            of SAS UNIVARIATE)
       describe.default     Describe a variable (generalization of SAS UNIVARIATE)
       do                   Assists with batch analyses
       dot.chart            Dot chart for one or two classification variables
       Dotplot              Enhancement of Trellis dotplot allowing for matrix
                            x-var., auto generation of Key function, superposition
       drawPlot             Simple mouse-driven drawing program, including a function
                            for fitting Bezier curves
       ecdf                 Empirical cumulative distribution function plot
       eip                  Edit an object "in-place" (may be dangerous!), e.g.
                            eip(sqrt) will replace the builtin sqrt function
       errbar               Plot with error bars (Charles Geyer, U. Chi., mod FEH)
       event.chart          Plot general event charts (Jack Lee, jjlee@mdanderson.org,
                            Ken Hess, Joel Dubin; Am Statistician 54:63-70,2000)
       event.history        Event history chart with time-dependent cov. status
                            (Joel Dubin, joel.dubin@yale.edu)
       find.matches         Find matches (with tolerances) between columns of 2 matrices
       first.word           Find the first word in an S expression (R Heiberger)
       fit.mult.impute      Fit most regression models over multiple transcan imputations,
                            compute imputation-adjusted variances and avg. betas
       format.df            Format a matrix or data frame with much user control
                            (R Heiberger and FE Harrell)
       ftupwr               Power of 2-sample binomial test using Fleiss, Tytun, Ury
       ftuss                Sample size for 2-sample binomial test using Fleiss, Tytun, Ury
                            (Both by Dan Heitjan, dheitjan@biostats.hmc.psu.edu)
       gbayes               Bayesian posterior and predictive distributions when both
                            the prior and the likelihood are Gaussian
       getHdata             Fetch and list datasets on our web site
       gs.slide             Sets nice defaults for graph sheets for S-Plus 2000 for
                            copying graphs into Microsoft applications
       hdquantile           Harrell-Davis nonparametric quantile estimator with s.e.
       histbackback         Back-to-back histograms (Pat Burns, Salomon Smith
                            Barney, London, pburns@dorado.sbi.com)
       hist.data.frame      Matrix of histograms for all numeric vars. in data frame
                            Use hist.data.frame(data.frame.name)
       histSpike            Add high-resolution spike histograms or density estimates
                            to an existing plot
       hoeffd               Hoeffding's D test (omnibus test of independence of X and Y)
       impute               Impute missing data (generic method)
       interaction          More flexible version of builtin function
       is.present           Tests for non-blank character values or non-NA numeric values
       james.stein          James-Stein shrinkage estimates of cell means from raw data
       labcurve             Optimally label a set of curves that have been drawn on
                            an existing plot, on the basis of gaps between curves.
                            Also position legends automatically at emptiest rectangle.
       label                Set or fetch a label for an S-object
       Lag                  Lag a vector, padding on the left with NA or ''
       latex                Convert an S object to LaTeX (R Heiberger & FE Harrell)
       ldBands              Lan-DeMets bands for group sequential tests
       list.tree            Pretty-print the structure of any data object
                            (Alan Zaslavsky, zaslavsk@hcp.med.harvard.edu)
       mask                 8-bit logical representation of a short integer value
                            (Rick Becker)
       matchCases           Match each case on one continuous variable
       matxv                Fast matrix * vector, handling intercept(s) and NAs
       mem                  mem() types quick summary of memory used during session
       mgp.axis             Version of axis() that uses appropriate mgp from
                            mgp.axis.labels and gets around bug in axis(2, ...)
                            that causes it to assume las=1
       mgp.axis.labels      Used by survplot and plot in Design library (and other
                            functions in the future) so that different spacing
                            between tick marks and axis tick mark labels may be
                            specified for x- and y-axes.  ps.slide, win.slide,
                            gs.slide set up nice defaults for mgp.axis.labels.
                            Otherwise use mgp.axis.labels('default') to set defaults.
                            Users can set values manually using
                            mgp.axis.labels(x,y) where x and y are 2nd value of
                            par('mgp') to use.  Use mgp.axis.labels(type=w) to
                            retrieve values, where w='x', 'y', 'x and y', 'xy',
                            to get 3 mgp values (first 3 types) or 2 mgp.axis.labels.
       minor.tick           Add minor tick marks to an existing plot
       mtitle               Add outer titles and subtitles to a multiple plot layout
       nomiss               Return a matrix after excluding any row with an NA
       panel.bpplot         Panel function for trellis bwplot - box-percentile plots
       panel.plsmo          Panel function for trellis xyplot - uses plsmo
       pc1                  Compute first prin. component and get coefficients on
                            original scale of variables
       plotCorrPrecision    Plot precision of estimate of correlation coefficient
       plsmo                Plot smoothed x vs. y with labeling and exclusion of NAs
                            Also allows a grouping variable and plots unsmoothed data
       popower              Power and sample size calculations for ordinal responses
                            (two treatments, proportional odds model)
       prn                  prn(expression) does print(expression) but titles the
                            output with 'expression'.  Do prn(expression,txt) to add
                            a heading ('txt') before the 'expression' title
       p.sunflowers         Sunflower plots (Andreas Ruckstuhl, Werner Stahel,
                            Martin Maechler, Tim Hesterberg)
       ps.slide             Set up postscript() using nice defaults for different types
                            of graphics media
       pstamp               Stamp a plot with date in lower right corner (pstamp())
                            Add ,pwd=T and/or ,time=T to add current directory
                            name or time
                            Put additional text for label as first argument, e.g.
                            pstamp('Figure 1')  will draw 'Figure 1  date'
       putKey               Different way to use key()
       putKeyEmpty          Put key at most empty part of existing plot
       rcorr                Pearson or Spearman correlation matrix with pairwise deletion
                            of missing data
       rcorr.cens           Somers' Dyx rank correlation with censored data
       rcorrp.cens          Assess difference in concordance for paired predictors
       rcspline.eval        Evaluate restricted cubic spline design matrix
       rcspline.plot        Plot spline fit with nonparametric smooth and grouped estimates
       rcspline.restate     Restate restricted cubic spline in unrestricted form, and
                            create TeX expression to print the fitted function
       recode               Recodes variables
       reShape              Reshape a matrix into 3 vectors, reshape serial data
       rm.boot              Bootstrap spline fit to repeated measurements model,
                            with simultaneous confidence region - least
                            squares using spline function in time
       rMultinom            Generate multinomial random variables with varying prob.
       samplesize.bin       Sample size for 2-sample binomial problem
                            (Rick Chappell, chappell@stat.wisc.edu)
       sas.get              Convert SAS dataset to S data frame
       sasxport.get         Enhanced importing of SAS transport dataset in R
       scat1d               Add 1-dimensional scatterplot to an axis of an existing plot
                            (like bar-codes, FEH/Martin Maechler,
                            maechler@stat.math.ethz.ch/Jens Oehlschlaegel-Akiyoshi,
                            oehl@psyres-stuttgart.de)
       score.binary         Construct a score from a series of binary variables or
                            expressions
       sedit                A set of character handling functions written entirely
                            in S.  sedit() does much of what the UNIX sed
                            program does.  Other functions included are
                            substring.location, substring<-, replace.string.wild,
                            and functions to check if a string is numeric or
                            contains only the digits 0-9
       setpdf               Adobe PDF graphics setup for including graphics in books
                            and reports with nice defaults, minimal wasted space
       setps                Postscript graphics setup for including graphics in books
                            and reports with nice defaults, minimal wasted space
                            Internally uses psfig function by
                            Antonio Possolo (antonio@atc.boeing.com).
                            setps works with Ghostscript to convert .ps to .pdf
       setTrellis           Set Trellis graphics to use blank conditioning panel strips,
                            line thickness 1 for dot plot reference lines:
                            setTrellis(); 3 optional arguments
       show.col             Show colors corresponding to col=0,1,...,99
       show.pch             Show all plotting characters specified by pch=.
                            Just type show.pch() to draw the table on the
                            current device.
       showPsfrag           Use LaTeX to compile, and dvips and ghostview to
                            display a postscript graphic containing psfrag strings
       solvet               Version of solve with argument tol passed to qr
       somers2              Somers' rank correlation and c-index for binary y
       spearman             Spearman rank correlation coefficient  spearman(x,y)
       spearman.test        Spearman 1 d.f. and 2 d.f. rank correlation test
       spearman2            Spearman multiple d.f. rho^2, adjusted rho^2, Wilcoxon-Kruskal-
                            Wallis test, for multiple predictors
       spower               Simulate power of 2-sample test for survival under
                            complex conditions
                            Also contains the Gompertz2,Weibull2,Lognorm2 functions.
       spss.get             Enhanced importing of SPSS files using read.spss function
       src                  src(name) = source("name.s") with memory
       store                Store an object permanently (easy interface to assign function)
       strmatch             Shortest unique identifier match
                            (Terry Therneau, therneau@mayo.edu)
       subset               More easily subset a data frame
       substi               Substitute one var for another when observations NA
       summarize            Generate a data frame containing stratified summary
                            statistics.  Useful for passing to trellis.
       summary.formula      General table making and plotting functions for summarizing
                            data
       symbol.freq          X-Y Frequency plot with circles' area prop. to frequency
       sys                  Execute unix() or dos() depending on what's running
       tex                  Enclose a string with the correct syntax for using
                            with the LaTeX psfrag package, for postscript graphics
       transace             ace() packaged for easily automatically transforming all
                            variables in a matrix
       transcan             Automatic transformation and imputation of NAs for a
                            series of predictor variables
       trap.rule            Area under curve defined by arbitrary x and y vectors,
                            using trapezoidal rule
       trellis.strip.blank  To make the strip titles in trellis more visible, you can
                            make the backgrounds blank by saying trellis.strip.blank().
                            Use before opening the graphics device.
       t.test.cluster       2-sample t-test for cluster-randomized observations
       uncbind              Form individual variables from a matrix
       upData               Update a data frame (change names, labels, remove vars, etc.)
       units                Set or fetch "units" attribute - units of measurement for var.
       varclus              Graph hierarchical clustering of variables using squared
                            Pearson or Spearman correlations or Hoeffding D as similarities
                            Also includes the naclus function for examining similarities in
                            patterns of missing values across variables.
       xy.group             Compute mean x vs. function of y by groups of x
       xYplot               Like trellis xyplot but supports error bars and multiple
                            response variables that are connected as separate lines
       win.slide            Setup win.graph or win.printer using nice defaults for
                            presentations/slides/publications
       wtd.mean             
       wtd.var              
       wtd.quantile         
       wtd.ecdf             
       wtd.table            
       wtd.rank             
       wtd.loess.noiter     
       num.denom.setup      Set of functions for obtaining weighted estimates
       zoom                 Zoom in on any graphical display
                            (Bill Dunlap, bill@statsci.com)
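
     As a minimal sketch of how a few of the functions listed above
     might be called, assuming the Hmisc package is installed and that
     Cs, wtd.mean and trap.rule behave as described in the table (the
     data values are made up for illustration):

```r
library(Hmisc)   # assumption: Hmisc is installed

# Cs: character vector built from unquoted names
vars <- Cs(age, sex, height)

# wtd.mean: weighted mean (illustrative values and weights)
x <- c(1, 2, 3)
w <- c(1, 1, 2)
wm <- wtd.mean(x, weights = w)

# trap.rule: area under the curve through (x, y) by the trapezoidal rule
auc <- trap.rule(c(0, 1, 2), c(0, 1, 4))
```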

_S_y_s_t_e_m _O_v_e_r_r_i_d_e_s:

     Hmisc overrides the system function model.frame.default to allow
     for more elegant handling of NAs: the user can specify a global
     method for handling NAs using
     options(na.action='na.methodname').  Hmisc also overrides the
     system subscripting method for factor vectors and date vectors,
     and it defines functions is.na.dates and is.na.times to check for
     NAs in date and time vectors.  By default, the [.factor
     redefinition by Hmisc causes unused levels to be dropped from the
     factor vector's levels attribute when the vector is subscripted.
     This can be overridden by using for example
     'x <- x[,drop=FALSE]' or by specifying a system option as
     follows: 'options(drop.factor.levels=FALSE)'.
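
     The two ideas in this paragraph, a global NA-handling option and
     the dropping of unused factor levels, can be illustrated with a
     small sketch.  It uses base R only and does not load Hmisc, so it
     shows R's own defaults rather than the override described here
     (base R keeps unused levels unless drop=TRUE is given).  The data
     frame d and the factor x are made-up illustrations.

```r
# Global NA handling: model.frame() picks up options(na.action = )
d <- data.frame(y = c(1, 2, NA, 4), x = c(1, NA, 3, 4))
old <- options(na.action = "na.omit")
mf <- model.frame(y ~ x, data = d)   # rows containing an NA are removed
n_complete <- nrow(mf)
options(old)                         # restore the previous setting

# Unused factor levels on subscripting
x <- factor(c("a", "b", "c", "a"))
kept    <- levels(x[x != "c"])               # base R default keeps "c"
dropped <- levels(x[x != "c", drop = TRUE])  # explicit drop removes it
```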

     Hmisc also overrides the trellis shingle function, which has a
     bug when its sole argument has a class (such as the "labelled"
     class created by the Hmisc label function). The shingle
     replacement has the default intervals argument set to
     sort(unique(unclass(x))) instead of sort(unique(x)).
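
     The effect of unclass() in that default can be seen in a small
     base R sketch.  The object x below is a hypothetical hand-built
     stand-in for a variable carrying the "labelled" class; it is not
     created with the Hmisc label function, and the values are made up.

```r
# Hypothetical stand-in for a variable with the "labelled" class
x <- structure(c(3, 1, 2, 3, 1), label = "Dose", class = "labelled")

# unclass() removes the class attribute, so unique() and sort() see a
# plain numeric vector rather than a classed object
plain <- unclass(x)
intervals <- sort(unique(plain))
```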

_C_o_p_y_r_i_g_h_t _N_o_t_i_c_e:

     *GENERAL DISCLAIMER*
      This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 2, or
     (at your option) any later version.

     This program is distributed in the hope that it will be useful,
     but WITHOUT ANY WARRANTY; without even the implied warranty of
     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
     General Public License for more details.

     In short: You may use it any way you like, as long as you don't
     charge money for it, remove this notice, or hold anyone liable for
     its results.  Also, please acknowledge the source and communicate
     changes to the author.

     If this software is used in work presented for publication, kindly
     reference it using for example:
      Harrell FE (2003): Hmisc S function library. Programs available
     from <URL: http://hesweb1.med.virginia.edu/biostat/s/Hmisc.html>.
      Be sure to reference S-Plus or R itself and other libraries used.

_A_c_k_n_o_w_l_e_d_g_e_m_e_n_t_s:

     This work was supported by grants from the Agency for Health Care
     Policy and Research (US Public Health Service) and the Robert Wood
     Johnson Foundation.

_A_u_t_h_o_r(_s):

     Frank E Harrell Jr
      Professor of Biostatistics
      Chair, Department of Biostatistics
      Vanderbilt University School of Medicine
      Nashville, Tennessee
      f.harrell@vanderbilt.edu

_R_e_f_e_r_e_n_c_e_s:

     See Alzola CF, Harrell FE (2002): An Introduction to S and the
     Hmisc and Design Libraries at <URL:
     http://hesweb1.med.virginia.edu/biostat/s/doc/splus.pdf> for
     extensive documentation and examples for the Hmisc library.

