na                  package:fSeries                  R Documentation

_H_a_n_d_l_i_n_g _M_i_s_s_i_n_g _V_a_l_u_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     A collection of functions for handling missing values in
     'timeSeries' objects, or in objects which can be transformed
     into a vector or a two-dimensional matrix. 

     The functions are listed by topic. 

       'na.omit'       handles NAs in a 'timeSeries' object,
       'removeNA'      removes rows with NAs from a matrix object,
       'substituteNA'  substitutes NAs by zeros, the column mean or median,
       'interpNA'      interpolates NAs using R's 'approx' function.

_U_s_a_g_e:

     ## S3 method for class 'timeSeries':
     na.omit(object, method = c("r", "s", "z", "ir", "iz", "ie"), 
         interp = c("before", "linear", "after"), ...)

     removeNA(x, ...)
     substituteNA(x, type = c("zeros", "mean", "median"), ...)
     interpNA(x, method = c("linear", "before", "after"), ...)

_A_r_g_u_m_e_n_t_s:

interp, type: [na.omit][substituteNA] - 
          For 'substituteNA', 'type' selects one of three ways to
          replace NAs in the data: 'type="zeros"' replaces the missing
          values by zeros, 'type="mean"' replaces them by the column
          mean, and 'type="median"' replaces them by the column
          median. For 'na.omit', 'interp' is one of '"before"',
          '"linear"' or '"after"', as listed under Usage. 

  method: [na.omit] - 
          Specifies how NAs are handled. One of the following
          character strings: 
            'method="r"'   remove NAs,
            'method="s"'   skip, i.e. do nothing (na.rm = FALSE),
            'method="z"'   substitute NAs by zeros,
            'method="ir"'  interpolate NAs and remove NAs at the
                           beginning and end of the series,
            'method="iz"'  interpolate NAs and substitute NAs at the
                           beginning and end of the series,
            'method="ie"'  interpolate NAs and extrapolate NAs at the
                           beginning and end of the series.
          [interpNA] - 
          Specifies how the matrix is interpolated, column by column.
          One of the character strings 'method="linear"',
          'method="before"' or 'method="after"'. The function 'approx'
          is used for the interpolation. 

  object: an object of class 'timeSeries'. 

       x: a numeric matrix, or any other object which can be
          transformed into a matrix through 'x = as.matrix(x, ...)'. If
          'x' is a vector, it will be transformed into a
          one-column matrix. 

     ...: arguments to be passed to the function 'as.matrix'. 

_D_e_t_a_i_l_s:

     *Missing Values in Price and Index Series:*

     Applied to 'timeSeries' objects, the function 'removeNA' simply
     removes rows with NAs from the series. To interpolate time
     series points, use the function 'interpNA'. Three interpolation
     methods are offered: '"linear"' does a linear interpolation,
     '"before"' uses the previous value, and '"after"' uses the
     following value. Note that the interpolation is done on the
     index scale and not on the time scale, i.e. observations are
     treated as equally spaced regardless of their time stamps.

     *Missing Values in Return Series:*

     For return series the function 'substituteNA' may be useful. It
     fills missing values with zeros ('type="zeros"'), or with the
     mean ('type="mean"') or median ('type="median"') of the
     respective column.

_N_o_t_e:

     The functions 'removeNA', 'substituteNA' and 'interpNA' are older
     implementations. Where possible, use the newer function
     'na.omit' instead.

_A_u_t_h_o_r(_s):

     Raphael Gottardo for the 'knn' function, 
      Diethelm Wuertz for the Rmetrics R-port.

_R_e_f_e_r_e_n_c_e_s:

     Troyanskaya O., Cantor M., Sherlock G., Brown P., Hastie T.,
     Tibshirani R., Botstein D., Altman R.B. (2001); _Missing Value
     Estimation Methods for DNA Microarrays_, Bioinformatics 17,
     520-525.

_E_x_a_m_p_l_e_s:

     ## Create a Matrix with NAs:
        X = matrix(rnorm(100), ncol = 5)
        # a single NA inside:
        X[3, 5] = NA
        # three in a row inside:
        X[17, 2:4] = c(NA, NA, NA)
        # three in a column inside:
        X[13:15, 4] = c(NA, NA, NA)
        # two at the right border:
        X[11:12, 5] = c(NA, NA)
        # one in the lower left corner:
        X[20, 1] = NA
        print(X)
          
     ## removeNA -
        # Remove rows with NA's
        removeNA(X)
        # Only 12 of the 20 rows remain!
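
     ## complete.cases -
        # Sketch in base R, not the package source: dropping every row
        # that contains at least one NA can also be written with
        # complete.cases(). 'Z' and 'Zclean' are illustrative names only.
        Z = matrix(c(1, 2, 3, 4, 5, NA, 7, 8, 9, 10, 11, 12), ncol = 3)
        # Logical index: TRUE for rows without any NA
        keep = complete.cases(Z)
        # drop = FALSE keeps the result a matrix even if one row is left
        Zclean = Z[keep, , drop = FALSE]
        print(Zclean)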
        
     ## substituteNA -
        # Substitute NAs by zeros, the column mean or the column median
        substituteNA(X, type = "zeros")
        substituteNA(X, type = "mean")
        substituteNA(X, type = "median")
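
     ## column-wise substitution in base R -
        # Sketch only, assuming the substitution works column by column
        # as described under Arguments; this is not the package source.
        # 'M', 'fillMedian' and 'Mfilled' are hypothetical names.
        M = matrix(c(1, NA, 3, 10, 20, NA, 40, 100), ncol = 2)
        # Replace the NAs of one column by that column's median,
        # computed from the non-missing entries only
        fillMedian = function(col) {
            col[is.na(col)] = median(col, na.rm = TRUE)
            col
        }
        # Apply the helper to every column; the result is again a matrix
        Mfilled = apply(M, 2, fillMedian)
        print(Mfilled)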
        
     ## interpNA - 
        # Interpolate NA's linearly:
        interpNA(X, method = "linear")
        # Note the corner missing value cannot be interpolated!
        # Take previous values in a column:
        interpNA(X, method = "before")
        # Here, too, the corner value is left as NA
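
     ## approx on the index scale -
        # Sketch only, not the package source: assuming each column is
        # handed to approx() with the row index as the x-axis, the
        # three interpolation methods would behave roughly as below.
        # 'y', 'idx', 'ok' and the result names are illustrative.
        y = c(NA, 1, NA, NA, 4, NA)
        idx = seq_along(y)
        ok = !is.na(y)
        # "linear": straight line between neighbouring observed values
        lin = approx(idx[ok], y[ok], xout = idx, method = "linear")$y
        # "before": step function taking the previous observed value
        bef = approx(idx[ok], y[ok], xout = idx, method = "constant", f = 0)$y
        # "after": step function taking the following observed value
        aft = approx(idx[ok], y[ok], xout = idx, method = "constant", f = 1)$y
        # approx() defaults to rule = 1, so NAs before the first and
        # after the last observed value stay NA in all three results.
        # rule = 2 repeats the nearest observed value at both ends.
        ext = approx(idx[ok], y[ok], xout = idx, method = "linear", rule = 2)$y
        print(rbind(lin, bef, aft, ext))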

