fpot                   package:evd                   R Documentation

_P_e_a_k_s _O_v_e_r _T_h_r_e_s_h_o_l_d _M_o_d_e_l_l_i_n_g _u_s_i_n_g _t_h_e _G_e_n_e_r_a_l_i_z_e_d _P_a_r_e_t_o
_o_r _P_o_i_n_t _P_r_o_c_e_s_s _R_e_p_r_e_s_e_n_t_a_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Maximum-likelihood fitting for peaks over threshold modelling,
     using the Generalized Pareto or Point Process representation,
     allowing any of the parameters to be held fixed if desired.

_U_s_a_g_e:

     fpot(x, threshold, model = c("gpd", "pp"), start, npp = length(x),
         cmax = FALSE, r = 1, ulow = -Inf, rlow = 1, mper = NULL, ...,
         std.err = TRUE, corr = FALSE, method = "BFGS", warn.inf = TRUE)

_A_r_g_u_m_e_n_t_s:

       x: A numeric vector. If this contains missing values, those
          values are treated as if they fell below the threshold.

threshold: The threshold over which exceedances are taken.

   model: The model; either '"gpd"' (the default) or '"pp"', for the
          Generalized Pareto or Point Process representations
          respectively.

   start: A named list giving the initial values for the parameters
          over which the likelihood is to be maximized. If 'start' is
          omitted the routine attempts to find good starting values
          using moment estimators.

     npp: The data should contain 'npp' observations per ``period'',
          where the return level plot produced by 'plot.pot' will
          represent return periods in units of ``periods''. By default
          'npp = length(x)', so that the ``period'' is the period of
          time over which the entire data set is collected. It may
          often be useful to change this default so that more sensible
          units are used. For example, if yearly periodic units are
          required, use 'npp = 365.25' for daily data and 'npp = 52.18'
          for weekly data. The argument only makes a difference to the
          actual fit if 'mper' is not 'NULL' or if 'model = "pp"' (see
          *Details*).

    cmax: Logical; if 'FALSE' (the default), the model is fitted using
          all exceedances over the threshold. If 'TRUE', the model is
          fitted using cluster maxima, using clusters of exceedances
          derived from 'clusters'.

r, ulow, rlow: Arguments used for the identification of clusters of
          exceedances (see 'clusters'). Ignored if 'cmax' is 'FALSE'
          (the default).

    mper: Controls the parameterization of the generalized Pareto
          model. Should be either 'NULL' (the default), or a positive
          number (see *Details*). If 'mper' is not 'NULL' and 'model =
          "pp"', an error is returned.

     ...: Additional parameters, either for the model or for the
          optimization function 'optim'. If parameters of the model are
          included they will be held fixed at the values given (see
          *Examples*).

 std.err: Logical; if 'TRUE' (the default), the standard errors are
          returned.

    corr: Logical; if 'TRUE', the correlation matrix is returned.

  method: The optimization method (see 'optim' for details).

warn.inf: Logical; if 'TRUE' (the default), a warning is given if the
          negative log-likelihood is infinite when evaluated at the
          starting values.
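
     The yearly values quoted for the 'npp' argument follow from
     dividing the mean number of days per year by the sampling
     interval. A minimal base R sketch of that arithmetic; the helper
     name 'npp_yearly' is hypothetical and is not part of 'evd':

```r
# Hypothetical helper (not part of evd): observations per year for data
# recorded every 'interval_days' days, assuming a mean year of 365.25 days.
npp_yearly <- function(interval_days) 365.25 / interval_days

npp_yearly(1)            # daily data
round(npp_yearly(7), 2)  # weekly data
```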

_D_e_t_a_i_l_s:

     If 'model = "gpd"', the exceedances over the threshold
     'threshold' (if 'cmax' is 'FALSE') or the maxima of the clusters
     of exceedances (if 'cmax' is 'TRUE') are fitted to a generalized
     Pareto distribution (GPD) with location 'threshold'. If 'model =
     "pp"', the exceedances are fitted to a non-homogeneous Poisson
     process (Coles, 2001).

     If 'mper' is 'NULL' (the default), the parameters of the model (if
     'model = "gpd"') are 'scale' and 'shape', for the scale and shape
     parameters of the GPD. If 'model = "pp"' the parameters are 'loc',
     'scale' and 'shape'. Under 'model = "pp"' the parameters can be
     interpreted as parameters of the Generalized Extreme Value
     distribution, fitted to the maxima of 'npp' random variables. In
     this case, the value of 'npp' should be reasonably large.

     For both characterizations, the shape parameters are equivalent.
     The scale parameter under the generalized Pareto characterization
     is equal to b + s(u - a), where a, b and s are the location, scale
     and shape parameters under the Point Process characterization, and
     where u is the threshold.
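
     As a small illustration of this relationship, the following base
     R sketch converts Point Process parameters into the implied
     generalized Pareto scale. The function name 'pp2gpd_scale' and
     the numeric values are hypothetical and are not part of 'evd'.

```r
# Hypothetical helper (not part of evd): GPD scale implied by the
# Point Process parameters loc (a), scale (b) and shape (s) at threshold u.
pp2gpd_scale <- function(loc, scale, shape, threshold) {
  gpd_scale <- scale + shape * (threshold - loc)  # b + s(u - a)
  if (any(gpd_scale <= 0))
    stop("the implied GPD scale must be positive")
  gpd_scale
}

# Illustrative values only: 1.5 + 0.2 * (3 - 2)
pp2gpd_scale(loc = 2, scale = 1.5, shape = 0.2, threshold = 3)
```

     The shape parameter needs no conversion, since it is the same
     under both characterizations.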

     If 'mper' = m is a positive value, then the generalized Pareto
     model is reparameterized so that the parameters are 'rlevel' and
     'shape', where 'rlevel' is the m ``period'' return level and
     ``period'' is defined via the argument 'npp'.

     The m ``period'' return level is defined as follows. Let G be the
     fitted generalized Pareto distribution function, with location
     'threshold' = u, so that 1 - G(z) is the fitted probability of an
     exceedance over z > u given an exceedance over u. The fitted
     probability of an exceedance over z > u is therefore p(1 - G(z)),
     where p is the estimated probability of exceeding u, which is given
     by the empirical proportion of exceedances. The m ``period''
     return level z_m satisfies p(1 - G(z_m)) = 1/(mN), where N is the
     number of points per period (multiplied by the estimate of the
     extremal index, if cluster maxima are fitted). In other words, z_m
     is the quantile of the fitted model that corresponds to the upper
     tail probability 1/(mN). If 'mper' is infinite, then z_m is the
     upper end point, given by 'threshold' minus 'scale'/'shape', and
     the shape parameter is then restricted to be negative.
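
     The defining equation p(1 - G(z_m)) = 1/(mN) can be solved for
     z_m in closed form by inverting the generalized Pareto survivor
     function. The base R sketch below does this under stated
     assumptions: the function name 'gpd_rlevel' and its argument
     names are hypothetical, the numeric values are illustrative, and
     this is not necessarily how 'fpot' performs the calculation
     internally. If cluster maxima are fitted, N is assumed to have
     already been multiplied by the extremal index estimate.

```r
# Hypothetical helper (not part of evd): solve p * (1 - G(z)) = 1 / (m * N)
# for z, where G is the GPD with location u, scale sigma and shape xi.
gpd_rlevel <- function(m, N, p, u, sigma, xi) {
  if (is.infinite(m)) {
    # Upper end point; only finite when the shape is negative
    if (xi >= 0) stop("an infinite 'm' requires a negative shape")
    return(u - sigma / xi)
  }
  k <- m * N * p  # reciprocal of the conditional tail probability 1 - G(z_m)
  if (k < 1) stop("m * N * p must be at least 1 for z_m to lie at or above u")
  if (abs(xi) < 1e-8) {
    u + sigma * log(k)           # exponential limit as xi -> 0
  } else {
    u + sigma / xi * (k^xi - 1)  # invert the GPD survivor function
  }
}

# Illustrative values only: daily data, 5% of observations above u = 1
z10 <- gpd_rlevel(m = 10, N = 365.25, p = 0.05, u = 1, sigma = 1.1, xi = 0.2)

# Plugging z10 back in should recover the target tail probability 1 / (m * N)
0.05 * (1 + 0.2 * (z10 - 1) / 1.1)^(-1 / 0.2)
1 / (10 * 365.25)
```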

_V_a_l_u_e:

     Returns an object of class 'c("pot","uvevd","evd")'.

     The generic accessor functions 'fitted' (or 'fitted.values'),
     'std.errors', 'deviance', 'logLik' and 'AIC' extract various
     features of the returned object.

     The function 'profile' can be used to obtain deviance profiles for
     the model parameters. In particular, profiles of the m ``period''
     return level z_m can be calculated and plotted when 'mper' = m.
     The function 'anova' compares nested models. The function 'plot'
     produces diagnostic plots.

     An object of class 'c("pot","uvevd","evd")' is a list containing
     the following components:

estimate: A vector containing the maximum likelihood estimates.

 std.err: A vector containing the standard errors.

   fixed: A vector containing the parameters of the model that have
          been held fixed.

   param: A vector containing all parameters (optimized and fixed).

deviance: The deviance at the maximum likelihood estimates.

    corr: The correlation matrix.

convergence, counts, message: Components taken from the list returned
          by 'optim'.

threshold, r, ulow, rlow, npp: The arguments of the same name.

   nhigh: The number of exceedances (if 'cmax' is 'FALSE') or the
          number of clusters of exceedances (if 'cmax' is 'TRUE').

nat, pat: The number and proportion of exceedances.

  extind: The estimate of the extremal index (i.e. 'nhigh' divided by
          'nat'). If 'cmax' is 'FALSE', this is 'NULL'.

    data: The data passed to the argument 'x'.

exceedances: The exceedances, or the maxima of the clusters of
          exceedances.

    mper: The argument 'mper'.

   scale: The scale parameter for the fitted generalized Pareto
          distribution. If 'mper' is 'NULL' and 'model = "gpd"' (the
          defaults), this will also be an element of 'param'.

    call: The call of the current function.

_W_a_r_n_i_n_g:

     The standard errors and the correlation matrix in the returned
     object are taken from the observed information, calculated by a
     numerical approximation. They must be interpreted with caution
     when the shape parameter is less than -0.5, because the usual
     asymptotic properties of maximum likelihood estimators do not then
     hold (Smith, 1985).
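
     A simple guard for this condition might look like the base R
     sketch below. The function name 'shape_is_regular' is
     hypothetical and not part of 'evd'; it assumes a named numeric
     vector with a "shape" element, such as the 'param' component
     described under Value.

```r
# Hypothetical check (not part of evd): 'param' is assumed to be a named
# numeric vector containing a "shape" element.
shape_is_regular <- function(param) {
  shape <- param[["shape"]]
  regular <- shape >= -0.5
  if (!regular)
    warning("shape < -0.5: standard errors may be unreliable")
  regular
}

# Illustrative values only
shape_is_regular(c(scale = 1.1, shape = 0.2))
```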

_R_e_f_e_r_e_n_c_e_s:

     Coles, S. (2001) _An Introduction to Statistical Modeling of
     Extreme Values_. London: Springer.

     Smith, R. L. (1985) Maximum likelihood estimation in a class of
     non-regular cases. _Biometrika_, *72*, 67-90.

_S_e_e _A_l_s_o:

     'anova.evd', 'optim', 'plot.uvevd', 'profile.evd',
     'profile2d.evd', 'mrlplot', 'tcplot'

_E_x_a_m_p_l_e_s:

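     # Simulate GPD data and fit the default GPD model above threshold 1
     uvdata <- rgpd(100, loc = 0, scale = 1.1, shape = 0.2)
     M1 <- fpot(uvdata, 1)
     # Refit with the shape held fixed at zero, then compare nested models
     M2 <- fpot(uvdata, 1, shape = 0)
     anova(M1, M2)
     # Diagnostic plots on a 2 x 2 grid
     par(mfrow = c(2,2))
     plot(M1)
     ## Not run: M1P <- profile(M1)
     ## Not run: plot(M1P)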

     # Reparameterize in terms of the 10 and 100 period return levels
     M1 <- fpot(uvdata, 1, mper = 10)
     M2 <- fpot(uvdata, 1, mper = 100)
     ## Not run: M1P <- profile(M1, which = "rlevel", conf=0.975, mesh=0.1)
     ## Not run: M2P <- profile(M2, which = "rlevel", conf=0.975, mesh=0.1)
     ## Not run: plot(M1P)
     ## Not run: plot(M2P)

