dclust               package:extRemes               R Documentation

_D_e_c_l_u_s_t_e_r _d_a_t_a _b_y _r_u_n_s _d_e_c_l_u_s_t_e_r_i_n_g.

_D_e_s_c_r_i_p_t_i_o_n:

     Decluster data by assuming that exceedances belong to the same
     cluster if they are separated by fewer than 'r' (run length)
     values below a given threshold.

_U_s_a_g_e:

     dclust(xdat, u, r, cluster.by = NULL, verbose=getOption("verbose"))

_A_r_g_u_m_e_n_t_s:

    xdat: a single numeric vector of data to be declustered.

       u: single number or vector of thresholds. 

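       r: run length; exceedances separated by fewer than 'r' values
          below the threshold are assigned to the same cluster.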

cluster.by: If there are blocks implying a natural clustering that is
          to be preserved (e.g., if data cover several years, but only
          for a single season), this is a vector defining the blocks to
          ensure that clusters do not cross over from one block to
          another.

 verbose: logical; whether to print progress information to the
          screen.

_D_e_t_a_i_l_s:

     This function applies runs declustering to automatically decluster
     a dataset. To ensure that clusters do not cross natural or decided
     boundaries, use the 'cluster.by' argument.  For example, suppose
     data are measured only in the summer, say from June 1 through
     August 1.  In that case, values from August 1, 2003 and June 1,
     2004 should probably not fall in the same cluster.  To account
     for this, create a 'cluster.by' vector defining years in order to
     keep clusters within years. For the example of data from June 1 to
     August 1 (62 days), a vector like c(rep(1, 62), rep(2, 62), ...,
     rep(n, 62)) should be used for the 'cluster.by' argument.
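
     As a small illustration of the block vector described above, the
     following sketch builds it with 'rep(..., each = )' rather than
     writing out each 'rep' call.  The number of years (3) and the
     variable names are assumptions made for the example only.

```r
# Hypothetical setup: 3 seasons of 62 daily values each (June 1 to August 1).
n_years <- 3
days_per_season <- 62

# Equivalent to c(rep(1, 62), rep(2, 62), rep(3, 62)).
cluster_by <- rep(seq_len(n_years), each = days_per_season)

length(cluster_by)  # one block label per observation
table(cluster_by)   # 62 observations in each block
```

     The resulting vector must have the same length and ordering as
     'xdat' so that each observation carries its own block label.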

     This function will return a vector of the same length as the
     original data vector, but with maximums from each cluster followed
     by 'filler' numbers that are below the given threshold, 'u'.
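
     To make the clustering rule concrete, here is a minimal base R
     sketch of runs declustering.  It is not the code used by 'dclust';
     the function name 'runs_decluster_sketch' and its return format
     are hypothetical, and it assumes an exceedance means a value
     strictly greater than 'u'.  It returns only the cluster labels and
     cluster maxima, without the filler values described above.

```r
# Hypothetical helper illustrating runs declustering; not the extRemes code.
runs_decluster_sketch <- function(x, u, r, cluster.by = NULL) {
  exc <- which(x > u)  # positions of exceedances (assumed strict inequality)
  if (length(exc) == 0L) {
    return(list(ncluster = 0L, clust = integer(0), maxima = numeric(0)))
  }
  # Number of non-exceedances between consecutive exceedances.
  gap <- diff(exc) - 1L
  # A new cluster starts when at least r values below u separate two exceedances.
  new_cluster <- gap >= r
  if (!is.null(cluster.by)) {
    # Also start a new cluster whenever the block label changes.
    blocks <- cluster.by[exc]
    new_cluster <- new_cluster | (blocks[-1] != blocks[-length(blocks)])
  }
  clust <- cumsum(c(TRUE, new_cluster))  # cluster label for each exceedance
  list(ncluster = max(clust),
       clust = clust,
       maxima = as.numeric(tapply(x[exc], clust, max)))
}

# Toy data: exceedances of u = 5 occur at positions 2, 3, 5 and 9.
x <- c(1, 6, 7, 2, 8, 1, 1, 1, 9, 3)
res1 <- runs_decluster_sketch(x, u = 5, r = 1)
res2 <- runs_decluster_sketch(x, u = 5, r = 2)
res1$ncluster  # 3 clusters with maxima 7, 8, 9
res2$ncluster  # 2 clusters with maxima 8, 9

# Blocks split after position 4, so the first cluster is cut in two.
res3 <- runs_decluster_sketch(x, u = 5, r = 2,
                              cluster.by = c(rep(1, 4), rep(2, 6)))
res3$ncluster  # 3 clusters
```

     Increasing 'r' can only merge clusters, never split them, which is
     why larger run lengths give the same number of clusters or fewer.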

     Missing values are not handled.  The function will still run, but
     the results will be questionable.
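
     Because missing values are not handled, it may be worth checking
     for them before declustering.  The sketch below uses only base R;
     the data vector is made up for illustration, and how to treat any
     missing values found is left to the analyst.

```r
# Hypothetical series with one missing value.
x <- c(101, 117, NA, 116, 99)

# Report where the missing values are before passing the data on.
if (anyNA(x)) {
  message(sum(is.na(x)), " missing value(s) at position(s): ",
          paste(which(is.na(x)), collapse = ", "))
}
```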

_V_a_l_u_e:

     A list with components: 

 xdat.dc: Maximums from each cluster with additional filler values
          below the given threshold 'u' in order to maintain the same
          length as the original data vector 'xdat'.  This is for
          compatibility with the extRemes GUI data object of class
          "ev.data".

ncluster: The number of clusters found by runs declustering.

   clust: numeric vector giving the clusters.

_A_u_t_h_o_r(_s):

     Eric Gilleland

_R_e_f_e_r_e_n_c_e_s:

     Coles, Stuart (2001).  An Introduction to Statistical Modeling of
     Extreme Values.  Springer-Verlag, London.

     Gilleland, Eric and Katz, Richard W. (2005).  Tutorial for the
     Extremes Toolkit: Weather and Climate Applications of Extreme
     Value Statistics.  <URL: http://www.assessment.ucar.edu/toolkit>.

_E_x_a_m_p_l_e_s:

     # Load a dataset.
     data(Tphap)

     plot( Tphap[,"MaxT"])
     abline( h=115)

     # Decluster using a threshold of 115 degrees and a run length of 'r=1'.
     temp <- dclust(xdat=Tphap[,"MaxT"], u=115, r=1, cluster.by = Tphap[,"Year"])
     temp[["ncluster"]] # See how many clusters were found.

     # Now do the same as above, but with a run length of 3 for comparison.
     # Note: 'r=2' gives same clusters as 'r=1' for these data.
     temp2 <- dclust(xdat=Tphap[,"MaxT"], u=115, r=3, cluster.by = Tphap[,"Year"])
     temp2[["ncluster"]]

     # See Gilleland and Katz (2005) for more.

