surrogate              package:fractal              R Documentation

_S_u_r_r_o_g_a_t_e _d_a_t_a _g_e_n_e_r_a_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     This function can be used to generate surrogate time series via
     various frequency domain bootstrapping techniques. Bootstrapping
     has been used (in the statistics community) to assess the sampling
     variability of certain statistics. The nonlinear dynamics
     community typically uses bootstrapping to detect nonlinear
     structure in stationary time series. Given a time series, this
     function is used to generate surrogate series via Theiler's
     Amplitude Adjusted Fourier Transform (AAFT), Theiler's phase
     randomization, Davies and Harte's Circulant Embedding (CE)
     technique, or Davison and Hinkley's (DH) phase and amplitude
     randomization technique.

     Theiler's techniques produce so-called _constrained realizations_
     since some statistical aspect of the original data is preserved
     (the histogram for the AAFT and the periodogram for the phase
     randomization). The other techniques, circulant embedding and
     Davison-Hinkley, are non-constrained as both the amplitudes and
     phases of the original series are randomized.

_U_s_a_g_e:

     surrogate(x, method="ce", sdf=NULL, seed=0)

_A_r_g_u_m_e_n_t_s:

       x: a vector containing a uniformly-sampled real-valued time
          series.

  method: a character string representing the method to be used to
          generate surrogate data. Choices are:

          '"_a_a_f_t"' Theiler's Amplitude Adjusted Fourier Transform.

          '"_p_h_a_s_e"' Theiler's phase randomization.

          '"_c_e"' Davies and Harte's Circulant Embedding.

          '"_d_h"' Davison and Hinkley's phase and amplitude
               randomization.

          Default: '"ce"'.

     sdf: an object of class 'SDF', containing a single-sided spectral
          density function estimation (corresponding to the original
          data) over normalized frequencies f(k)=k/(2N) for k=0,...,N
          where N is the number of samples in the original time series.
          This argument is only used for the circulant embedding
          method. Default: 'NULL' unless the circulant embedding method
          is used, and then it is 'SDF(x, method="multitaper",
          recenter=TRUE, taper=h, single.sided=TRUE)' where 'h =
          taper(type="sine", n.sample=N, n.taper=5, norm=TRUE)'.

    seed: a non-negative integer representing the initial seed value to
          use for the random number generator. If 'seed=0', the current
          time is used as a means of generating a (unique) seed value.
          Otherwise, the specified seed value is used. Default: '0'.
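
     As a small illustration of the frequency grid described for the
     'sdf' argument, the following sketch evaluates f(k)=k/(2N) in base
     R. The length 'N = 8' is an arbitrary value chosen for the
     example, not a package default.

```r
# Hypothetical series length, chosen only for illustration.
N <- 8

# Normalized frequencies f(k) = k / (2N) for k = 0, ..., N.
k <- 0:N
f <- k / (2 * N)

# The grid has N + 1 points and runs from 0 to the Nyquist
# frequency of 0.5 cycles per sample.
print(f)
```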

_D_e_t_a_i_l_s:

     The algorithms are detailed as follows:

     _p_h_a_s_e The discrete Fourier transform of a time series is
          calculated and the phase at each frequency is randomized to
          be uniformly distributed on [0, 2*PI]. Phase symmetry is
          preserved so that an inverse DFT forms a purely real
          surrogate. Null hypothesis: the original data come from a
          linear Gaussian process. Side effect: the periodograms of the
          surrogate and original time series are the same.

     _a_a_f_t An N-point normally distributed realization of a white noise
          process is created, where N is the length of 'x', and sorted
          to have the same rank as 'x' (e.g., if rank(x[t]) = 5 it
          means that x[t] is the fifth smallest element of 'x'). The
          result is then phase randomized and its rank (r) is then
          calculated. The surrogate is then created by rank ordering
          'x' using r. Null hypothesis: the observed time series is a
          monotonic nonlinear transformation of a Gaussian process.
          Side effect: the amplitude distribution (histogram) of the
          surrogate and original time series are the same.

     _c_e The circulant embedding technique is based upon generating
          surrogates whose estimated SDF (e.g., a periodogram) is not
          constrained to be the same as that of the original series
          (see the references for details).

     _d_h The Davison-Hinkley technique is based upon generating
          surrogates by randomizing both the phases and the amplitudes
          in the frequency domain followed by an inversion back to the
          time domain.
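
     The 'phase' and 'aaft' descriptions above can be sketched in base
     R as follows. This is a minimal illustration under stated
     assumptions, not the package's implementation: the helper names
     'phase_randomize' and 'aaft_sketch' are hypothetical, ties are
     broken by order of appearance, and the zero-frequency and Nyquist
     terms keep their original phases so that the result stays real.

```r
# Hypothetical helper: phase randomization of a real-valued series.
# The DFT amplitudes are kept; phases are redrawn uniformly on [0, 2*pi].
phase_randomize <- function(x) {
  n <- length(x)
  X <- fft(x)
  amp <- Mod(X)
  new_phase <- Arg(X)

  # Number of strictly positive frequencies below the Nyquist frequency.
  n_pos <- if (n %% 2 == 0) n / 2 - 1 else (n - 1) / 2

  if (n_pos > 0) {
    theta <- runif(n_pos, 0, 2 * pi)
    pos <- 2:(n_pos + 1)          # positive-frequency bins
    neg <- n:(n - n_pos + 1)      # mirrored negative-frequency bins
    new_phase[pos] <- theta
    new_phase[neg] <- -theta      # conjugate symmetry gives a real inverse
  }

  Y <- amp * exp(1i * new_phase)
  Re(fft(Y, inverse = TRUE)) / n  # R's inverse FFT is unnormalized
}

# Hypothetical helper: AAFT following the three steps in the text.
aaft_sketch <- function(x) {
  n <- length(x)
  # Step 1: Gaussian white noise reordered to share the rank order of x.
  g <- sort(rnorm(n))[rank(x, ties.method = "first")]
  # Step 2: phase randomize the result and take its ranks r.
  r <- rank(phase_randomize(g), ties.method = "first")
  # Step 3: rank order x using r.
  sort(x)[r]
}

set.seed(42)                                   # for a repeatable example
x <- as.numeric(arima.sim(list(ar = 0.8), n = 128))  # assumed test series
y_phase <- phase_randomize(x)
y_aaft  <- aaft_sketch(x)
```

     Because only the phases are changed, 'Mod(fft(y_phase))' matches
     'Mod(fft(x))' up to rounding error, and 'sort(y_aaft)' equals
     'sort(x)'. These are the two side effects noted above.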

_V_a_l_u_e:

     an object of class 'surrogate'.

_S_3 _M_E_T_H_O_D_S:


     _p_l_o_t plots the surrogate data realizations. The following options
          may be used to adjust the plot components:

          _s_h_o_w. A character string defining the data to display.
               Choices are '"series"', '"surrogate"', or '"both"' for
               plots corresponding to the original series, surrogate
               series, or both, respectively. Default: '"surrogate"'.

          _t_y_p_e Character string denoting the type of data to plot.
               Options are '"time"' for time history, '"sdf"' for a
               multitaper spectral density function estimation, '"pdf"'
               for a probability density function estimation, and
               '"lag"' for a two-dimensional embedding (lag plot).
               Default: '"time"'.

          _s_t_a_c_k A logical flag. If 'TRUE', the 'stackPlot' function is
               called as opposed to the default plot function. As
               'stackPlot' requires a common abscissa, this option is
               only available for 'type="time"' (time history) or
               'type="sdf"' (spectral density function plot). Default:
               'TRUE'.

          _x_l_a_b Character string denoting the x-axis label for the
               '"time"' and '"sdf"' '"pdf"' types. Default: "Time", the
               series name, and "Frequency (Hz)", respectively.

          _y_l_a_b Character string denoting the y-axis label for the
               '"time"' style. Default: the series name.

          _c_e_x Character expansion factor (same as the 'cex' argument of
               the 'par' function). Default: '1'.

          _a_d_j._m_a_i_n Title adjustment ala the 'adj' argument of the 'par'
               function). Default: '1'.

          _l_i_n_e._m_a_i_n Line spacing for title ala the 'line' argument of
               the 'text' function). Default: '0.5'.

          _c_o_l._s_e_r_i_e_s A character string or integer denoting the color
               to use when plotting data corresponding to the original
               series. See the 'colors' function for more details.
               Default: '"black"'.

          _c_o_l._s_u_r_r_o_g_a_t_e A character string or integer denoting the
               color to use when plotting data corresponding to the
               surrogate series. See the 'colors' function for more
               details. Default: '"red"'.

          ... Additional plot arguments (set internally by the 'par'
               function).


     _p_r_i_n_t prints a summary of the surrogate data realization.
          Available options are:

          ... Additional print arguments used by the standard
               'print' function.


_R_e_f_e_r_e_n_c_e_s:

     J. Theiler and S. Eubank and A. Longtin and B. Galdrikian and J.D.
     Farmer (1992), Testing for nonlinearity in time series:  the
     method of surrogate data, _Physica D: Nonlinear Phenomena_, *58*,
     77-94.

     R.B. Davies and D.S. Harte (1987), Tests for the Hurst effect,
     _Biometrika_, *74*, 95-102.

     D.B. Percival and W.L.B. Constantine (2002), Exact Simulation of
     Gaussian Time Series from Nonparametric Spectral Estimates with
     Application to Bootstrapping, _Statistics and Computing_, under
     review.

     D.B. Percival and A. Walden (1993), _Spectral Analysis for
     Physical Applications: Multitaper and Conventional Univariate
     Techniques_, Cambridge University Press, Cambridge, UK.

     D.B. Percival, S. Sardy and A.C. Davison (2001), Wavestrapping
     Time Series: Adaptive Wavelet-Based Bootstrapping, in W.J.
     Fitzgerald, R.L. Smith, A.T. Walden and P.C. Young (Eds.),
     _Nonlinear and Nonstationary Signal Processing_, Cambridge
     University Press, Cambridge, UK.

     D.T. Kaplan (1995), Nonlinearity and Nonstationarity: The Use of
     Surrogate Data in Interpreting Fluctuations in Heart Rate,
     _Proceedings of the 3rd Annual Workshop on Computer Applications
     of Blood Pressure and Heart Rate Signals_, Florence, Italy, 4-5
     May.

_S_e_e _A_l_s_o:

     'infoDim', 'corrDim'.

_E_x_a_m_p_l_e_s:

     ## create surrogate data sets using circulant 
     ## embedding method 
     surr <- surrogate(beamchaos, method="ce")

     ## print the result 
     print(surr)

     ## plot and compare various statistics of the 
     ## surrogate and original time series 
     plot(surr, type="time")
     plot(surr, type="sdf")
     plot(surr, type="lag")
     plot(surr, type="pdf")

     ## create comparison time history 
     plot(surr, show="both", type="time")

