clusters                 package:evd                 R Documentation

_I_d_e_n_t_i_f_y _C_l_u_s_t_e_r_s _o_f _E_x_c_e_e_d_e_n_c_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     Identify clusters of exceedences of a threshold, optionally
     returning the cluster maxima and plotting the clusters.

_U_s_a_g_e:

     clusters(data, u, r = 1, ulow = -Inf, rlow = 1, cmax = FALSE,
         keep.names = TRUE, plot = FALSE, xdata = seq(along = data),
         lvals = TRUE, lty = 1, lwd = 1, pch = par("pch"),
         col = if(n > 250) NULL else "grey", xlab = "Index",
         ylab = "Data", ...)

_A_r_g_u_m_e_n_t_s:

    data: A numeric vector, which may contain missing values.

       u: A single value giving the threshold, unless a time varying
          threshold is used, in which case 'u' should be a vector of
          thresholds, typically with the same length as 'data' (or else
          the usual recycling rules are applied).

       r: A positive integer denoting the clustering interval length. By
          default the interval length is one.

    ulow: A single value giving the lower threshold, unless a time
          varying lower threshold is used, in which case 'ulow' should
          be a vector of lower thresholds, typically with the same
          length as 'data' (or else the usual recycling rules are
          applied). By default there is no lower threshold (or
          equivalently, the lower threshold is '-Inf').

    rlow: A positive integer denoting the lower clustering interval
          length. The lower clustering interval length is only relevant
          if it is less than the clustering interval length 'r' and if
          there exists a lower threshold (greater than '-Inf').

    cmax: Logical; if 'FALSE' (the default), a list containing the
          clusters of exceedences is returned. If 'TRUE' a numeric
          vector containing the cluster maxima is returned.

keep.names: Logical; if 'FALSE', the function makes no attempt to
          retain the names/indices of the observations within the
          returned object. If 'data' contains a large number of
          observations, this can make the function run much faster. The
          argument is mainly designed for internal use.

    plot: Logical; if 'TRUE' a plot is given that depicts the
          identified clusters, and the clusters (if 'cmax' is 'FALSE')
          or cluster maxima (if 'cmax' is 'TRUE') are returned
          invisibly. If 'FALSE' (the default), the following arguments
          are ignored.

   xdata: A numeric vector with the same length as 'data', giving the
          values to be plotted on the x-axis.

   lvals: Logical; should the values below the threshold and the line
          depicting the lower threshold be plotted?

lty, lwd: Line type and width for the lines depicting the threshold and
          the lower threshold.

     pch: Plotting character.

     col: Strips of colour 'col' are used to identify the clusters. An
          observation is contained in the cluster if the centre of the
          corresponding plotting character is contained in the coloured
          strip. If 'col' is 'NULL' the strips are omitted. By default
          the strips are coloured '"grey"', but are omitted whenever
          'data' contains more than 250 observations.

xlab, ylab: Labels for the x and y axes.

     ...: Other graphics parameters.

_D_e_t_a_i_l_s:

     The clusters of exceedences are identified as follows. The first
     exceedence of the threshold initiates the first cluster. The first
     cluster then remains active until either 'r' consecutive values
     fall below (or are equal to) the threshold, or until 'rlow'
     consecutive values fall below (or are equal to) the lower
     threshold. The next exceedence of the threshold (if it exists)
     then initiates the second cluster, and so on. Missing values are
     allowed, in which case they are treated as falling below (or equal
     to) the threshold, but falling above the lower threshold.
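
     The rule above can be sketched in base R. This is an illustration
     only, not the package's implementation: 'find_clusters' is a
     hypothetical helper, and it assumes that each returned component
     holds only the exceedences of a cluster, which may differ from
     the components returned by 'clusters'.

```r
# Hypothetical helper, not part of evd: a base-R sketch of the rule above.
find_clusters <- function(x, u, r = 1, ulow = -Inf, rlow = 1) {
  n <- length(x)
  u <- rep(u, length.out = n)          # recycle thresholds to length(x)
  ulow <- rep(ulow, length.out = n)
  above <- !is.na(x) & x > u           # NA is treated as not exceeding 'u'
  below_low <- !is.na(x) & x <= ulow   # NA is treated as above 'ulow'
  id <- integer(n)                     # cluster number; 0 = no cluster
  k <- 0L
  active <- FALSE
  run <- 0L                            # consecutive values not above 'u'
  run_low <- 0L                        # consecutive values not above 'ulow'
  for (i in seq_len(n)) {
    if (above[i]) {
      if (!active) {                   # an exceedence starts a new cluster
        k <- k + 1L
        active <- TRUE
      }
      run <- 0L
      run_low <- 0L
      id[i] <- k
    } else if (active) {
      run <- run + 1L
      run_low <- if (below_low[i]) run_low + 1L else 0L
      # the cluster ends after 'r' values at or below 'u',
      # or after 'rlow' values at or below 'ulow'
      if (run >= r || run_low >= rlow) active <- FALSE
    }
  }
  split(x[id > 0], id[id > 0])         # assumption: keep exceedences only
}

x <- c(1, 5, 1, 6, 1, 1, 1, 7, NA, 8)
find_clusters(x, u = 4, r = 2)
find_clusters(x, u = 4, r = 3, ulow = 2, rlow = 1)
```

     In the second call the value 1 that follows 5 is at or below the
     lower threshold, so with 'rlow = 1' it ends the first cluster
     immediately, even though 'r = 3' alone would have kept it active.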

_V_a_l_u_e:

     If 'cmax' is 'FALSE' (the default), a list with one component for
     each identified cluster. If 'cmax' is 'TRUE', a numeric vector
     containing the cluster maxima. In any case, the returned object
     has an attribute 'acs', giving the average cluster size (where the
     cluster size is defined as the number of exceedences within a
     cluster), which will be 'NaN' if there are no values above the
     threshold (and hence no clusters).

     If 'plot' is 'TRUE', the list of clusters, or vector of cluster
     maxima, is returned invisibly.
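
     As a rough illustration of these quantities, the sketch below
     computes cluster maxima and an average cluster size from a
     made-up list of clusters. The list 'cl' is hypothetical (it is
     not output of 'clusters'), and the arithmetic simply follows the
     definition above; the package's internal computation may differ.

```r
# Hypothetical clusters containing exceedences only (not real evd output).
cl <- list(c(5, 6), c(7, 8, 9))

# Cluster maxima: one value per cluster.
cluster_max <- vapply(cl, max, numeric(1))

# Average cluster size: number of exceedences divided by number of clusters.
acs <- sum(lengths(cl)) / length(cl)

# With no clusters the ratio is 0 / 0, which R evaluates to NaN.
empty <- list()
acs_empty <- sum(lengths(empty)) / length(empty)
```

     The empty case shows why 'acs' is 'NaN' rather than zero when no
     value exceeds the threshold.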

_S_e_e _A_l_s_o:

     'exi'

_E_x_a_m_p_l_e_s:

     data(portpirie)
     clusters(portpirie, 4.2, 3)
     clusters(portpirie, 4.2, 3, cmax = TRUE)
     clusters(portpirie, 4.2, 3, 3.8, plot = TRUE)
     clusters(portpirie, 4.2, 3, 3.8, plot = TRUE, lvals = FALSE)
     tvu <- c(rep(4.2, 20), rep(4.1, 25), rep(4.2, 20))
     clusters(portpirie, tvu, 3, plot = TRUE)
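
     The 'acs' attribute described under Value can be read with
     'attr'. The snippet below is a small sketch that assumes the evd
     package is installed; it does nothing otherwise.

```r
# Runs only when evd is available.
if (requireNamespace("evd", quietly = TRUE)) {
  data(portpirie, package = "evd")
  cl <- evd::clusters(portpirie, 4.2, 3)
  length(cl)        # number of identified clusters
  attr(cl, "acs")   # average cluster size
}
```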

