prepare              package:clustTool              R Documentation

_F_u_n_c_t_i_o_n _f_o_r _t_r_a_n_f_o_r_m_a_t_i_o_n _a_n_d _s_t_a_n_d_a_r_d_i_s_a_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     This function can be used for transformation and standardisation
     of the data.

_U_s_a_g_e:

     prepare(x, scaling = "classical", transformation = "logarithm", powers = "none")

_A_r_g_u_m_e_n_t_s:

       x: data frame or matrix 

 scaling: Scaling of the data. 

          Possible values are: "classical", "robust", "none" 

transformation: Transformation of the data. 

          Possible values are: "logarithm", "boxcox", "bcOpt",
          "logratio", "logcentered", "iso", "none" 

  powers: Powers for the Box-Cox transformation, one for each variable
          (used if "boxcox" is chosen) 

_D_e_t_a_i_l_s:

     *Transformation*:

     "logarithm" replaces the values of x with their natural logarithm,
     using the function 'log'.

     "boxcox" applies a Box-Cox transformation to each variable. The
     powers must be specified in 'powers'.

     "bcOpt" applies a Box-Cox transformation to each variable. The
     powers are calculated with the function 'box.cox.powers'.
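
     As an illustration of the two Box-Cox options above, the following
     is a minimal sketch of the standard one-parameter Box-Cox
     transformation applied column by column. The helpers 'box_cox' and
     'box_cox_columns' are hypothetical names introduced here; they are
     not part of clustTool, and it is an assumption that 'prepare' uses
     this standard form internally.

```r
# Standard one-parameter Box-Cox transformation of a positive vector.
# 'box_cox' is a hypothetical helper, not a clustTool function.
box_cox <- function(x, lambda) {
  stopifnot(all(x > 0))
  # lambda = 0 is defined as the natural logarithm (the limiting case)
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Apply one power per column, mirroring the idea of the 'powers' argument.
box_cox_columns <- function(x, powers) {
  x <- as.matrix(x)
  stopifnot(length(powers) == ncol(x))
  out <- x
  for (j in seq_len(ncol(x))) {
    out[, j] <- box_cox(x[, j], powers[j])
  }
  out
}

m <- cbind(a = c(1, 4, 9), b = c(1, 10, 100))
bc <- box_cox_columns(m, powers = c(0.5, 0))  # square-root-like and log
```

     With a power of 0 the transformation coincides with "logarithm",
     so "boxcox" can be read as a generalisation of that option.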

     "none" is also possible.

     Transformation before clustering: Cluster analysis in general does
     not need normally distributed data. However, it is advisable that
     heavily skewed data are first transformed to a more symmetric
     distribution. If a good cluster structure exists for a variable we
     can expect a distribution which has two or more modes. A
     transformation to more symmetry will preserve the modes but remove
     large skewness. 
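
     The point about modes and skewness can be checked on simulated
     data. The sketch below is only an illustration with made-up
     log-normal data (not the humus data): it builds a variable with
     two groups, then compares a simple moment-based skewness before
     and after taking logarithms. The 'skewness' helper is defined here
     and is not a clustTool function.

```r
set.seed(1)

# Two groups on a multiplicative scale: heavily right-skewed overall.
x <- c(rlnorm(500, meanlog = 0, sdlog = 0.5),
       rlnorm(500, meanlog = 3, sdlog = 0.5))

# Simple moment-based skewness (hypothetical helper for this sketch).
skewness <- function(v) mean((v - mean(v))^3) / sd(v)^3

skew_raw <- skewness(x)       # skewness of the raw values
skew_log <- skewness(log(x))  # skewness after the log transformation

# hist(x) and hist(log(x)) can be used to inspect the modes visually.
```

     On the log scale the two groups remain visible as separate modes,
     while the overall skewness is much smaller than on the raw scale.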

     *Standardisation*:

     "classical" applies a _z_-transformation to each variable by using
     the function 'scale'.

     "robust" applies a robustified _z_-transformation by using the
     median and the MAD. 
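
     The two scaling options can be sketched as follows. This is an
     illustration under assumptions, not the package's own code: the
     helper 'robust_scale' is a hypothetical name, and it uses R's
     'mad' with its default consistency constant (1.4826), which may
     differ from what 'prepare' does internally.

```r
# Toy data: v1 contains one gross outlier (100).
x <- cbind(v1 = c(1, 2, 3, 4, 100), v2 = c(10, 20, 30, 40, 50))

# Classical z-transformation: centre by the mean, divide by the
# standard deviation of each column.
z_classical <- scale(x)

# Robustified z-transformation (hypothetical helper): centre by the
# median, divide by the MAD of each column.
robust_scale <- function(x) {
  x <- as.matrix(x)
  med <- apply(x, 2, median)
  s <- apply(x, 2, mad)  # mad() uses constant = 1.4826 by default
  scale(x, center = med, scale = s)
}

z_robust <- robust_scale(x)
```

     In the classical version the outlier in v1 inflates both the mean
     and the standard deviation; the median and the MAD are far less
     affected by it.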

     "none" is also possible.

     Standardisation before clustering: Standardisation is needed if
     the variables show a striking difference in the amount of
     variability.
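
     A small made-up example shows why this matters for distance-based
     clustering. In the sketch below (toy data, Euclidean distances
     from 'dist'), the distances computed from both variables are
     almost identical to those computed from the high-variability
     variable alone, so the other variable has practically no
     influence unless the data are standardised first.

```r
# Two variables with very different amounts of variability.
x <- cbind(small = c(1, 2, 3, 4, 5),
           large = c(100, 300, 200, 500, 400))

# Euclidean distances using both variables, and using 'large' only.
d_raw <- as.matrix(dist(x))
d_large_only <- as.matrix(dist(x[, "large", drop = FALSE]))

# Largest relative difference between the two distance matrices
# (off-diagonal entries only, where the distance is positive).
pos <- d_raw > 0
max_rel_diff <- max(abs(d_raw - d_large_only)[pos] / d_raw[pos])

# After standardisation both variables have unit standard deviation.
sds_scaled <- apply(scale(x), 2, sd)
```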

_V_a_l_u_e:

     Transformed and standardised data.

_A_u_t_h_o_r(_s):

     Matthias Templ

_S_e_e _A_l_s_o:

     'scale', 'box.cox.powers'

_E_x_a_m_p_l_e_s:

     require(mvoutlier)
     data(humus)
     x <- humus[,4:40]
     xNew <- prepare(x, scaling="classical", transformation="logarithm")

