GpdFit               package:fExtremes               R Documentation

_M_o_d_e_l_l_i_n_g _t_h_e _G_e_n_e_r_a_l_i_z_e_d _P_a_r_e_t_o _D_i_s_t_r_i_b_u_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     A collection and description of functions to model  the
     Generalized Pareto Distribution, GPD, based on  R's 'evir'
     package. Two approaches for parameter  estimation are provided:
     Maximum likelihood estimation  and the probability weighted moment
     method. 

     The functions are:

       'gpdSim'           generates data from the GPD,
       'gpdFit'           fits empirical or simulated data to the distribution,
       'print'            print method for a fitted GPD object of class ...,
       'plot'             plot method for a fitted GPD object,
       'summary'          summary method for a fitted GPD object,
       'gpdqPlot'         estimation of high quantiles,
       'gpdquantPlot'     variation of high quantiles with threshold,
       'gpdriskmeasures'  prescribed quantiles and expected shortfalls,
       'gpdsfallPlot'     expected shortfall with confidence intervals,
       'gpdshapePlot'     variation of shape with threshold,
       'gpdtailPlot'      plot of the tail.

_U_s_a_g_e:

     gpdSim(model = list(shape = 0.25, location = 0, scale = 1), n = 1000)
     gpdFit(x, threshold = NA, nextremes = NA, type = c("mle", "pwm"),
         information = c("observed", "expected"), ...)

     ## S3 method for class 'gpdFit':
     print(x, ...)
     ## S3 method for class 'gpdFit':
     plot(x, which = "all", ...)
     ## S3 method for class 'gpdFit':
     summary(object, doplot = TRUE, which = "all", ...)

     gpdqPlot(x, pp = 0.99, ci.type = c("likelihood", "wald"), ci.p = 0.95, 
         like.num = 50)
     gpdquantPlot(data, p = 0.99, models = 30, start = 15, end = 500, 
         reverse = TRUE, ci = 0.95, autoscale = TRUE, labels = TRUE, ...)
     gpdriskmeasures(x, plevels = c(0.99, 0.995, 0.999, 0.9995, 0.9999))
     gpdsfallPlot(x, pp = 0.99, ci.p = 0.95, like.num = 50)
     gpdshapePlot(data, models = 30, start = 15, end = 500, reverse = TRUE,
         ci = 0.95, autoscale = TRUE, labels = TRUE, ...) 
     gpdtailPlot(fit, optlog = NA, extend = 1.5, labels = TRUE, ...)

_A_r_g_u_m_e_n_t_s:

autoscale: whether or not plot should be automatically scaled; if not,
          xlim and ylim graphical parameters may be entered. 

      ci: the probability for asymptotic confidence band; for no 
          confidence band set to zero. 

    ci.p: the probability for confidence interval (must be less  than
          0.999). 

 ci.type: the method for calculating a confidence interval: 
          '"likelihood"' or '"wald"'. 

    data: a numeric vector of data. 

  doplot: a logical. Should the results be plotted? 

  extend: optional argument for plots 1 and 2 expressing how far x-axis
           should extend as a multiple of the largest data value. This 
          argument must take values greater than 1 and is useful for
          showing estimated quantiles beyond data. 

     fit: [print][plot][summary] - 
           print method, a fitted object of class '"gpd"'. 

information: whether standard errors should be calculated with
          '"observed"' or '"expected"' information. This only applies
          to the maximum likelihood method; for the
          probability-weighted moments method '"expected"' information
          is used if possible. 

  labels: optional argument for plots 1 and 2  specifying whether or
          not  axes should be labelled. 

like.num: the number of times to evaluate profile likelihood. 

   model: [gpdsim] - 
           a list with components 'shape', 'location' and  'scale'
          giving the parameters of the GPD distribution. By default the
          shape parameter has the value 0.25, the location is zero and
          the scale is one.

  models: the number of consecutive gpd models to be fitted. 

       n: [gpdsim] - 
           lnumber of generated data points, an integer value. 

nextremes: [gpdFit] - 
           the number of upper extremes to be used (either this or 
          'threshold' must be given but not both). 

  object: [summary] - 
           a fitted object of class '"gpdFit"'. 

  optlog: optional argument for plots 1 and 2 giving a particular
          choice  of logarithmic axes: '"x"' x-axis only; '"y"' y-axis 
          only; '"xy"' both axes; '""' neither axis. 

plevels, p, pp: a vector of probability levels, the desired probability
          for the  quantile estimate (e.g. 0.99 for the 99th
          percentile). 

 reverse: should plot be by increasing threshold ('TRUE') or number  of
          extremes ('FALSE'). 

start, end: the lowest and maximum number of exceedances to be
          considered. 

threshold: a threshold value (either this or 'nextremes' must be given 
          but not both). 

    type: a character string selecting the desired estimation mehtod,
          either '"mle"' for the maximum likelihood mehtod or '"pwm"'
          for  the probability weighted moment method. By default, the
          first will  be selected. Note, the function 'gpd' uses
          '"ml"'.     

   which: if 'which' is set to '"ask"' the function will  interactively
          ask which plot should be displayed. By default this value is
          set to 'FALSE' and then those plots will be displayed for
          which the elements in the logical vector 'which' ar set to
          'TRUE'; by default all four elements are set to '"all"'. 

       x: [gpdFit] - 
           the data vector. Note, there are two different names for the
          first argument 'x' and 'data' depending  which function name
          is used, either 'gpdFit' or the  EVIS synonyme 'gpd'. 
           [print][plot] - 
           a fitted object of class '"gpdFit"'. 

     ...: control parameters and plot parameters optionally passed to
          the  optimization and/or plot function. Parameters for the
          optimization function are passed to components of the
          'control' argument of 'optim'.   

_D_e_t_a_i_l_s:

     *Simulation:* 

      'gpdSim' simulates data from a Generalized Pareto  distribution. 

     *Parameter Estimation:* 

      'gpdFit' fits the model parameters either by the probability 
     weighted moment method or the maxim log likelihood method.  The
     function returns an object of class '"gpd"'  representing the fit
     of a generalized Pareto model to excesses over  a high threshold.
     The fitting functions use the probability weighted  moment method,
     if method 'method="pwm"' was selected, and the  the general
     purpose optimization function 'optim' when the  maximum likelihood
     estimation, 'method="mle"' or 'method="ml"'  is chosen. 

     *Methods:* 

      'print.gpd', 'plot.gpd' and 'summary.gpd' are print,  plot, and
     summary methods for a fitted object of class 'gpdFit'.  The plot
     method provides four different plots for assessing fitted  GPD
     model.  

     *gpd* Functions:* 

      'gpdqPlot' calculates quantile estimates and confidence intervals
      for high quantiles above the threshold in a GPD analysis, and
     adds a  graphical representation to an existing plot. The GPD
     approximation in  the tail is used to estimate quantile. The
     '"wald"' method uses  the observed Fisher information matrix to
     calculate confidence interval.  The '"likelihood"' method
     reparametrizes the likelihood in terms  of the unknown quantile
     and uses profile likelihood arguments to  construct a confidence
     interval.  

     'gpdquantPlot' creates a plot showing how the estimate of a  high
     quantile in the tail of a dataset based on the GPD approximation 
     varies with threshold or number of extremes. For every model 
     'gpdFit' is called. Evaluation may be slow. Confidence intervals 
     by the Wald method may be fastest. 

     'gpdriskmeasures' makes a rapid calculation of point estimates  of
     prescribed quantiles and expected shortfalls using the output of
     the function 'gpdFit'. This function simply calculates point
     estimates  and (at present) makes no attempt to calculate
     confidence intervals for  the risk measures. If confidence levels
     are required use 'gpdqPlot'  and 'gpdsfallPlot' which interact
     with graphs of the tail of a loss distribution and are much
     slower.   

     'gpdsfallPlot' calculates expected shortfall estimates, in other
     words tail conditional expectation and confidence intervals for
     high   quantiles above the threshold in a GPD analysis. A
     graphical  representation to an existing plot is added. Expected
     shortfall is  the expected size of the loss, given that a
     particular quantile of the  loss distribution is exceeded. The GPD
     approximation in the tail is used  to estimate expected shortfall.
     The likelihood is reparametrised  in  terms of the unknown
     expected shortfall and profile likelihood arguments  are used to
     construct a confidence interval.  

     'gpdshapePlot' creates a plot showing how the estimate of shape 
     varies with threshold or number of extremes. For every model 
     'gpdFit' is called. Evaluation may be slow.   

     'gpdtailPlot' produces a plot of the tail of the underlying 
     distribution of the data.

_V_a_l_u_e:

     'gpdSim' 
      returns a vector of datapoints from the simulated  series.

     'gpdFit'  
      returns an object of class '"gpd"' describing the  fit including
     parameter estimates and standard errors. 

     'gpdquantPlot' 
      returns invisible a table of results.

     'gpdshapePlot' 
      returns invisible a table of results.

     'gpdtailPlot' 
      returns invisible a list object containing  details of the plot
     is returned invisibly. This object should be  used as the first
     argument of 'gpdqPlot' or 'gpdsfallPlot'  to add quantile
     estimates or expected shortfall estimates to the  plot.

_A_u_t_h_o_r(_s):

     This function is based on Alec Stephenson's R-package 'evir' 
     ported from the 'EVIS' library, _Extreme Values in S_, written by
     Alexander McNeil. The 'fExtremes' port and the  change and
     addition of some functions were done by Diethelm Wuertz.

_R_e_f_e_r_e_n_c_e_s:

     Hosking J.R.M., Wallis J.R., (1987); _Parameter and quantile
     estimation for the generalized Pareto distribution_,   
     Technometrics 29, 339-349.

_E_x_a_m_p_l_e_s:

     ## SOURCE("fExtremes.53B-GpdFit")

     ## Load Data:
        data(danish)
        
     ## gpdSim - 
        # Simulate GPD Data:
        xmpExtremes("\nStart: Simulate a GPD Distributed Sample > ")
        x = gpdSim(model = list(shape = 0.25, location = 0, scale = 1), n = 1000)
        
     ## gpdFit -   
        xmpExtremes("\nNext: Fit Simulated Data to GPD using PWM > ")
        fit = gpdFit(x, nextremes = length(x), type = "pwm") 
        print(fit)
        par(mfcol = c(4, 2), cex = 0.7)
        summary(fit)  
        
     ## gpdFit -
        xmpExtremes("\nNext: Fit Simulated Data to GPD using MLE > ")
        fit = gpdFit(x, nextremes = length(x), type = "mle") 
        print(fit)
        summary(fit) 

     ## gpdFit - 
        xmpExtremes("\nNext: Fit Danish Fire Data to Excess Losses over 10 > ")
        fit = gpdFit(danish, 10, type = "mle") 
        print(fit)
        par(mfrow = c(2, 2), cex = 0.7)
        summary(fit)  
          
     ## gpdqPlot - 
        xmpExtremes("\nNext: 99.5th Percentiles for Danish Fire Data > ")
        fit = gpdFit(danish, threshold = 10, type = "mle")
        par(mfrow = c(1, 1))
        tail = gpdtailPlot(fit)
        gpdqPlot(tail, 0.995)
        title(main = "Danish Data: 99.5th Percentile") 
             
     ## gpdquantPlot -       
        xmpExtremes("\nNext: 99.9th Percentiles for Danish Fire Data > ")
        par(mfrow = c(1, 1))
        gpdquantPlot(danish, p = 0.999)
        title(sub = "Danish Fire: GPD High Quantile") 

     ## gpdsfallPlot - 
        xmpExtremes("\nNext: Expected Shortfall for Danish Fire Data > ")
        fit = gpdFit(danish, nextremes = floor(length(danish)/10), type = "mle")
        par(mfrow = c(1, 1))
        tp = gpdtailPlot(fit)
        gpdsfallPlot(tp, 0.999)   
        title(main = "Danish Fire: Expected Shortfall")   
        
     ## gpdriskmeasures -
        xmpExtremes("\nNext: Quantiles and Expected Shortfalls > ")
        # Give estimates of 0.999 and 0.9999 quantiles - Danish Fire Date:   
        fit = gpdFit(danish, threshold = 10, type = "mle") 
        par(mfrow = c(1, 1))
        gpdriskmeasures(fit, c(0.99, 0.995, 0.999, 0.9995, 0.9999))   
       
     ## gpdshapePlot - 
        xmpExtremes("\nNext: Shape Plot of Heavy-Tailed Simulated Data > ")
        set.seed(4711)
        par(mfrow = c(1, 1))
        gpdshapePlot(gpdSim(n = 1000))
        title(sub = "Simulated GPD", cex.sub = 0.7)   

     ## gpdshapePlot - 
        xmpExtremes("\nNext: Shape Plot of Heavy-Tailed Danish Fire Data > ")
        par(mfrow = c(1, 1))
        gpdshapePlot(danish)
        title(sub = "Danish Fire", cex.sub = 0.7)

     ## gpdtailPlot - 
        xmpExtremes("\nNext: Plot Tail Estimate of Danish Fire Data >")
        fit = gpdFit(danish, threshold = 10, type = "mle")
        par(mfrow = c(1, 1))
        gpdtailPlot(fit, main = "Danish Fire: GPD Tail Estimate", col = "steelblue4")  

