bctrans                 package:alr3                 R Documentation

_U_n_i_v_a_r_i_a_t_e _a_n_d _m_u_l_t_i_v_a_r_i_a_t_e _t_r_a_n_s_f_o_r_m_a_t_i_o_n_s _t_o _n_o_r_m_a_l_i_t_y

_D_e_s_c_r_i_p_t_i_o_n:

     Estimates multivariate power transformations to multinormality by
     a maximum likelihood-like method. The univariate case is obtained
     when only one variable is specified.

_U_s_a_g_e:

     bctrans(formula, data = NULL, subset, na.action = na.omit, ...)

     ## Use when you have a matrix or data.frame:
     bctrans1(X, Y = NULL, start = NULL, family = "box.cox", call = NULL, ...)

     ## S3 methods for class 'bctrans'
     lrt.bctrans(object, lrt=NULL, ones=TRUE, zeroes=TRUE)

_A_r_g_u_m_e_n_t_s:

 formula: A formula, giving the variables to be transformed.  The
          formula can be _one-sided_, of the form '~ X1+X2+X3', or
          _two-sided_, of the form 'Y~X1+X2+X3'.  In the latter case,
          the response is not  used in the transformation, but it will
          be used in the 'plot' method.  If you have previously
          computed a linear model fit, say 'm1', then you can use 'm1'
          in place of the formula.

    data: a data.frame (or list) from which the variables in the
          formula should be taken.

  subset: an optional vector specifying a subset of observations to be
          used.

na.action: If set to na.omit, the default, observations with missing
          values are dropped before fitting.  If set to na.fail, any
          missing value causes an error.

     ...: In bctrans, these are additional arguments passed to
          'bctrans1', and described below. In bctrans1, these are
          additional arguments passed to the function maximizer
          'optim'.

       X: A vector, matrix, or data frame whose columns are to be
          transformed.

       Y: If present this vector will be part of the object created,
          and will be used in drawing plots.  It is not used for
          finding transformations.

   start: Starting values for the power transformation parameters; if
          NULL (the default), univariate transformations will be
          computed and used as the start values.

  family: The family of transformations.  The most common is
          '"box.cox"' for the Box-Cox transformation.  The
          '"yeo.johnson"' transformations are used if some elements of
          X are negative or zero.  The family '"power"' is used only in
          the 'plot.bctrans' and 'add.trans' functions to give basic
          power transformations, which can't be normalized to have
          Jacobian one.  The argument lambda indexes the family of
          transformations psi(X,lambda).

  object: In 'lrt.bctrans', the name of a 'bctrans' object.

     lrt: In the 'lrt.bctrans' command, a list of vectors each of
          length equal to the number of columns in X.  A likelihood
          ratio test that the transformation parameters equal each of
          these vectors will be performed.

    ones: In 'lrt.bctrans', if TRUE, test that all the transformation
          parameters equal one against a general alternative.

  zeroes: In 'lrt.bctrans', if TRUE, test that all the transformation
          parameters equal zero against a general alternative.

    call: Not to be set by the user.
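
     The two main choices for 'family' can be sketched in base R as
     follows.  This is a minimal illustration written from the
     published definitions (Box and Cox, 1964; Yeo and Johnson, 2000),
     not code taken from the package; the function names 'box_cox' and
     'yeo_johnson' are hypothetical.

```r
# Scaled power (Box-Cox) transformation; requires x > 0.
# NOTE: 'box_cox' is an illustrative name, not a function in alr3.
box_cox <- function(x, lambda) {
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Yeo-Johnson transformation; defined for any real x.
# NOTE: 'yeo_johnson' is an illustrative name, not a function in alr3.
yeo_johnson <- function(x, lambda) {
  out <- numeric(length(x))
  pos <- x >= 0
  # Non-negative values: Box-Cox applied to x + 1.
  out[pos] <- if (abs(lambda) < 1e-8) {
    log1p(x[pos])
  } else {
    ((x[pos] + 1)^lambda - 1) / lambda
  }
  # Negative values: Box-Cox with power 2 - lambda applied to 1 - x,
  # with the sign reversed.
  out[!pos] <- if (abs(lambda - 2) < 1e-8) {
    -log1p(-x[!pos])
  } else {
    -((1 - x[!pos])^(2 - lambda) - 1) / (2 - lambda)
  }
  out
}

x <- c(-2, 0, 3)
yeo_johnson(x, 1)    # lambda = 1 leaves the data unchanged
yeo_johnson(x, 0.5)  # handles the zero and the negative value
```

     Both functions are continuous in lambda, so the logarithm appears
     as the limiting case rather than as a separate transformation.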

_D_e_t_a_i_l_s:

     Given a matrix X with columns X1, ..., Xp, this routine selects
     transformation parameters lambda1,...,lambdap from a
     one-parameter family of transformations such that the transformed
     variables psi(X1,lambda1),...,psi(Xp,lambdap) are as close to
     multivariate normal as possible.
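
     The idea can be sketched in base R as follows.  This is an
     illustration under stated assumptions, not the package's
     implementation: it uses the Box-Cox family only, starts the search
     at lambda = 1 for every column rather than at univariate
     estimates, and relies on the default Nelder-Mead method of
     'optim'.  The names 'box_cox', 'profile_loglik' and 'lrt' are
     hypothetical, and the data are simulated.

```r
# Scaled power (Box-Cox) transformation of a positive vector x.
box_cox <- function(x, lambda) {
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Profile log-likelihood of lambda, up to an additive constant, for a
# matrix X with strictly positive columns.  The last term is the log
# Jacobian of the transformation.
profile_loglik <- function(lambda, X) {
  n <- nrow(X)
  Z <- sapply(seq_len(ncol(X)), function(j) box_cox(X[, j], lambda[j]))
  S <- cov(Z) * (n - 1) / n  # maximum likelihood covariance estimate
  log_det <- as.numeric(determinant(S, logarithm = TRUE)$modulus)
  -n / 2 * log_det + sum((lambda - 1) * colSums(log(X)))
}

# Simulated data: column 1 is lognormal, column 2 is a squared normal.
set.seed(1)
n <- 200
X <- cbind(exp(rnorm(n)), rnorm(n, mean = 4)^2)

# optim() minimizes, so pass the negative log-likelihood.
fit <- optim(c(1, 1), function(l) -profile_loglik(l, X))
lambda_hat <- fit$par

# Likelihood ratio test of H0: lambda = lambda0 against a general
# alternative, referred to a chi-squared distribution.
lrt <- function(lambda0) {
  stat <- 2 * (profile_loglik(lambda_hat, X) - profile_loglik(lambda0, X))
  c(LRT = stat, df = length(lambda0),
    p.value = pchisq(stat, df = length(lambda0), lower.tail = FALSE))
}
lrt(c(1, 1))  # all powers equal to one (no transformation)
lrt(c(0, 0))  # all powers equal to zero (all logarithms)
```

     The two calls to 'lrt' correspond to the hypotheses tested when
     'ones' and 'zeroes' are TRUE in 'lrt.bctrans'.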

     The function uses the family of transformations you specify.  If
     you use the family 'box.cox' to select a transformation, it is
     usual to use standard power transformations (X^lambda, or log(X)
     when lambda is zero) in further calculations.
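
     One reason the standard powers are an acceptable substitute: for a
     fixed nonzero lambda, the Box-Cox scaled power is a linear
     function of X^lambda, so a linear model with an intercept gives
     the same fitted values with either version.  The following sketch
     checks this on simulated data; the variable names are
     illustrative only.

```r
set.seed(2)
x <- exp(rnorm(50))                      # a positive predictor
y <- 1 + 2 * sqrt(x) + rnorm(50, sd = 0.1)

lambda <- 0.5
scaled <- (x^lambda - 1) / lambda        # Box-Cox scaled power
basic  <- x^lambda                       # standard power

fit_scaled <- lm(y ~ scaled)
fit_basic  <- lm(y ~ basic)

# The fitted values agree up to numerical tolerance; only the
# coefficients differ.
all.equal(fitted(fit_scaled), fitted(fit_basic))
```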

_V_a_l_u_e:

     'bctrans' returns an object of class 'bctrans', which may be
     printed or summarized.  It is a list with components 'X'
     containing the input data, 'family' the family used, 'start' the
     starting values, and 'optim' the results from a call to
     'optim', the function optimizer used in the routine.

_A_u_t_h_o_r(_s):

     A substantial part of this code is borrowed from the function
     'box.cox.powers' in the 'car' package, written by John Fox, and
     documented in Fox (2002).  It is based on a similar function in
     Arc; see Cook and Weisberg (1999). It was modified by Sanford
     Weisberg, sandy@stat.umn.edu and renamed 'bctrans'.

_R_e_f_e_r_e_n_c_e_s:

     Box, G. E. P. and Cox, D. R. (1964).  An analysis of
     transformations.  _Journal of the Royal Statistical Society,
     Series B_, 26, 211-46.

     Cook, R. D. and Weisberg, S. (1999).  _Applied Regression
     Including Computing and Graphics_.  Wiley.

     Fox, J. (2002).  _An R and S-Plus Companion to Applied
     Regression_.  Sage.

     Velilla, S. (1993).  A note on the multivariate Box-Cox
     transformation to normality.  _Statistics and Probability
     Letters_, 17, 259-263.

     Weisberg, S. (2005).  _Applied Linear Regression_, third edition.
     Wiley.

     Yeo, I. and Johnson, R. (2000).  A new family of power
     transformations to improve normality or symmetry.  _Biometrika_,
     87, 954-959.

_S_e_e _A_l_s_o:

     'powtran', 'optim', 'pairs', 'inv.res.plot', 'plot.bctrans'

_E_x_a_m_p_l_e_s:

     data(highway)
     b <- highway[,c(8,1,2,10,5)] # select interesting columns
     summary(ans <- bctrans1(b,family="yeo.johnson")) # zeros ==> use yeo.johnson
     # or, compute using a formula and get the same answer.
     summary(ans2 <- bctrans(~Len+ADT+Trks+Shld+Sigs,data=highway,family="yeo.johnson"))
     # or, first fit an lm, and extract the formula
     m1 <- lm(Rate~Len+ADT+Trks+Shld+Sigs,data=highway)
     summary(ans3 <- bctrans(m1,data=highway,family="yeo.johnson"))
     # work with the response
     b$Sigs <- (round(b$Sigs*b$Len)+1)/b$Len # redefine so no zeroes
     summary(ans <- bctrans1(b)) # fit with box.cox
     lrt.bctrans(ans,lrt=list(c(0,0,-1,1,0)))
     plot(ans,family="power") # plot, but use ordinary powers
     b <- cbind(b,powtran(ans)) # add transformed variables to data frame

