ExtremesData            package:fExtremes            R Documentation

_E_x_p_l_o_r_a_t_i_v_e _D_a_t_a _A_n_a_l_y_s_i_s

_D_e_s_c_r_i_p_t_i_o_n:

     A collection and description of functions for explorative data
     analysis. The tools include plot functions for empirical
     distributions, quantile plots, graphs exploring the properties of
     exceedances over a threshold, plots for the maximum/sum ratio, and
     plots for the development of records.

     The functions are:

       'emdPlot'            Plot of empirical distribution function,
       'qqparetoPlot'       Exponential/Pareto quantile plot,
       'mePlot'             Plot of mean excesses over a threshold,
       'mrlPlot'            another variant, mean residual life plot,
       'mxfPlot'            another variant, with confidence intervals,
       'msratioPlot'        Plot of the ratio of maximum and sum,
       'recordsPlot'        Record development compared with iid data,
       'ssrecordsPlot'      another variant, investigates subsamples,
       'sllnPlot'           verifies Kolmogorov's strong law of large numbers,
       'lilPlot'            verifies Hartman-Wintner's law of the iterated logarithm,
       'xacfPlot'           ACF of exceedances over a threshold,
       'normMeanExcessFit'  fits mean excesses with a normal density,
       'ghMeanExcessFit'    fits mean excesses with a GH density,
       'hypMeanExcessFit'   fits mean excesses with a HYP density,
       'nigMeanExcessFit'   fits mean excesses with a NIG density,
       'ghtMeanExcessFit'   fits mean excesses with a GHT density.

_U_s_a_g_e:

     emdPlot(x, doplot = TRUE, plottype = c("xy", "x", "y", " "), 
         labels = TRUE, ...)

     qqparetoPlot(x, xi = 0, trim = NULL, threshold = NULL, doplot = TRUE, 
         labels = TRUE, ...)

     mePlot(x, doplot = TRUE, labels = TRUE, ...)
     mrlPlot(x, ci = 0.95, umin = mean(x), umax = max(x), nint = 100, doplot = TRUE, 
          plottype = c("autoscale", ""), labels = TRUE, ...)  
     mxfPlot(x, u = quantile(x, 0.05), doplot = TRUE, labels = TRUE, ...)  
        
     msratioPlot(x, p = 1:4, doplot = TRUE, labels = TRUE, ...) 
        
     recordsPlot(x, ci = 0.95, doplot = TRUE, labels = TRUE, ...)
     ssrecordsPlot(x, subsamples = 10, doplot = TRUE, plottype = c("lin", "log"),
         labels = TRUE, ...)
         
     sllnPlot(x, doplot = TRUE, labels = TRUE, ...)
     lilPlot(x, doplot = TRUE, labels = TRUE, ...)

     xacfPlot(x, u = quantile(x, 0.95), lag.max = 15, doplot = TRUE, 
         which = c("all", 1, 2, 3, 4), labels = TRUE, ...)
         
     normMeanExcessFit(x, doplot = TRUE, trace = TRUE, ...)
     ghMeanExcessFit(x, doplot = TRUE, trace = TRUE, ...)
     hypMeanExcessFit(x, doplot = TRUE, trace = TRUE, ...)
     nigMeanExcessFit(x, doplot = TRUE, trace = TRUE, ...)
     ghtMeanExcessFit(x, doplot = TRUE, trace = TRUE, ...)

_A_r_g_u_m_e_n_t_s:

      ci: [mrlPlot][recordsPlot] - 
           a confidence level. By default 0.95, i.e. 95%. 

  doplot: a logical value. Should the results be plotted? By  default
          'TRUE'. 

  labels: a logical value. Whether or not x- and y-axes should be
          automatically  labelled and a default main title should be
          added to the plot. By default 'TRUE'. 

 lag.max: [xacfPlot] - 
           maximum number of lags at which to calculate the
          autocorrelation  functions. The default value is 15. 

    nint: [mrlPlot] - 
           the number of intervals, see 'umin' and 'umax'. The  default
          value is 100. 

       p: [msratioPlot] - 
           the power exponents, a numeric vector. By default a sequence
          from   1 to 4 in unit integer steps. 

plottype: [emdPlot] - 
           which axes should be on a log scale: '"x"' x-axis only; 
          '"y"' y-axis only; '"xy"' both axes; '""'  neither axis. 
           [mrlPlot] - 
           a character string. If set to '"autoscale"', the scale of
          the plot is determined automatically; any other string allows
          user-specified scale information through the '...' argument. 
           [ssrecordsPlot] - 
           one of two options can be selected, either '"lin"' or
          '"log"'. The default creates a linear plot. 

subsamples: [ssrecordsPlot] - 
           the number of subsamples, by default 10, an integer value. 

threshold, trim: [qqparetoPlot] - 
           'threshold' is a numeric value at which the data are to be
          left-truncated; 'trim' is a numeric value at which the data
          are to be right-truncated. Both are 'NULL' by default. 

   trace: a logical flag, by default 'TRUE'. Should the calculations   
            be traced? 

       u: a numeric value, the level at which the data are to be
          truncated. For 'xacfPlot' the default is the threshold value
          which belongs to the 95% quantile, 'u=quantile(x,0.95)'; for
          'mxfPlot' the default is 'u=quantile(x,0.05)'.

umin, umax: [mrlPlot] - 
           range of threshold values. If 'umin' and/or 'umax' are not
          specified, they are set by default to 'umin=mean(x)' and
          'umax=max(x)'. 

   which: [xacfPlot] - 
           a numeric or character value, if 'which="all"' then all four
          plots are displayed, if 'which' is an integer between one and
          four, then the first, second, third or fourth plot will be
          displayed. 

    x, y: numeric data vectors or in the case of x an object to be
          plotted.   

      xi: the shape parameter of the generalized Pareto distribution. 

     ...: additional arguments passed to the FUN or plot function. 

_D_e_t_a_i_l_s:

     *Empirical Distribution Function:* 

      The function 'emdPlot' is a simple exploratory function. A
     straight line on the double log scale indicates Pareto tail
     behaviour. 
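
      The straight-line behaviour can be illustrated with a minimal
     base R sketch. It does not call 'emdPlot' and is not the package
     implementation; the simulated Pareto sample, the tail index
     'alpha' and the use of 'ppoints' for the plotting positions are
     assumptions made for illustration only.

```r
# Minimal sketch in base R; not the fExtremes implementation.
set.seed(1)
alpha <- 2                         # assumed tail index of the simulated data
x <- runif(1000)^(-1 / alpha)      # Pareto(alpha) sample on [1, Inf)

xs <- sort(x)
surv <- 1 - ppoints(length(xs))    # empirical survival probabilities, all > 0

# Both axes on a log scale, as with plottype = "xy"
plot(xs, surv, log = "xy", xlab = "x", ylab = "1 - F(x)",
     main = "Empirical survival function, log-log scale")

# For a Pareto tail the points fall near a line with slope -alpha
slope <- unname(coef(lm(log(surv) ~ log(xs)))[2])
```

      The fitted slope is only a rough visual aid here, not a
     recommended tail index estimator.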

     *Quantile-Quantile Pareto Plot:* 

      'qqparetoPlot' creates a quantile-quantile plot for threshold
     data. If 'xi' is zero, the reference distribution is the
     exponential; if 'xi' is non-zero, the reference distribution is
     the generalized Pareto with shape parameter 'xi'. In the case of
     the exponential, the plot is interpreted as follows: concave
     departures from a straight line are a sign of heavy-tailed
     behaviour, convex departures show thin-tailed behaviour.
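
      The construction of such a plot can be sketched in base R as
     follows. The helper 'gpd_quantile' is a hypothetical name
     introduced here, and the choice of axes (ordered data against
     reference quantiles) is an assumption; the sketch does not
     reproduce the trimming or threshold handling of 'qqparetoPlot'.

```r
# Minimal sketch in base R; gpd_quantile() is a hypothetical helper.
# Quantile function of the standard GPD with shape xi (scale 1, location 0);
# xi = 0 gives the standard exponential distribution.
gpd_quantile <- function(p, xi = 0) {
  if (xi == 0) -log(1 - p) else ((1 - p)^(-xi) - 1) / xi
}

set.seed(2)
x <- rexp(500)                     # simulated exponential data
p <- ppoints(length(x))            # plotting positions in (0, 1)

ordered <- sort(x)
reference <- gpd_quantile(p, xi = 0)

plot(ordered, reference, xlab = "Ordered data",
     ylab = "Exponential quantiles")

# A correlation near 1 indicates an approximately straight line
r <- cor(ordered, reference)
```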

     *Mean Excess Function Plot:* 

      Three variants to plot the mean excess function are available: a
     sample mean excess plot over increasing thresholds, and two mean
     excess function plots with confidence intervals for discrimination
     in the tails of a distribution. In general, an upward trend in a
     mean excess function plot shows heavy-tailed behaviour. In
     particular, a straight line with positive gradient above some
     threshold is a sign of Pareto behaviour in the tail. A downward
     trend shows thin-tailed behaviour, whereas a line with zero
     gradient shows an exponential tail.

      Some hints: because the upper plotting points are averages of
     only a handful of extreme excesses, they may be omitted for a
     prettier plot. 'mrlPlot' and 'mxfPlot' investigate the upper
     tail; for the lower tail, reverse the sign of the data vector
     'x'.
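
      The sample mean excess function itself is easy to sketch in base
     R. The helper 'mean_excess' below is a hypothetical name, not a
     function of the package, and the grid of thresholds is an
     arbitrary choice. For the simulated exponential data the curve
     should be roughly flat.

```r
# Minimal sketch in base R; mean_excess() is a hypothetical helper.
# Sample mean excess: average of (x - u) over all observations x > u.
mean_excess <- function(x, u) {
  vapply(u, function(ui) mean(x[x > ui] - ui), numeric(1))
}

set.seed(3)
x <- rexp(2000, rate = 2)          # exponential tail, mean excess 1 / rate

# Thresholds up to the 90% quantile, leaving out the noisy upper end
u <- unname(quantile(x, seq(0, 0.9, by = 0.1)))
me <- mean_excess(x, u)

plot(u, me, type = "b", xlab = "Threshold u", ylab = "Mean excess")
```

      To look at the lower tail with the same helper, pass '-x'
     instead of 'x'.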

     *Plot of the Maximum/Sum Ratio:* 

      The ratio of maximum and sum is a simple tool for detecting heavy
     tails of a distribution and for giving a rough estimate of the
     order of its finite moments. Sharp increases in the curves of a
     'msratioPlot' are a sign of heavy-tailed behaviour. 
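
      A minimal base R sketch of the ratio, computed for growing
     sample size n and power p, is given below. The helper 'ms_ratio'
     is a hypothetical name and the Student-t sample is an assumption
     for illustration. For iid data the ratio tends to zero when the
     p-th absolute moment is finite, so curves that keep jumping
     upwards suggest an infinite moment of that order.

```r
# Minimal sketch in base R; ms_ratio() is a hypothetical helper.
# Ratio max(|x_1|^p, ..., |x_n|^p) / (|x_1|^p + ... + |x_n|^p) for n = 1, 2, ...
ms_ratio <- function(x, p) {
  y <- abs(x)^p
  cummax(y) / cumsum(y)
}

set.seed(4)
x <- rt(2000, df = 3)              # heavy-tailed sample, moments finite for p < 3

old_par <- par(mfrow = c(2, 2))
for (p in 1:4) {
  plot(ms_ratio(x, p), type = "l", ylim = c(0, 1),
       xlab = "n", ylab = "Max/Sum ratio", main = paste("p =", p))
}
par(old_par)
```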

     *Plot of the Development of Records:* 

      These are functions that investigate the development of records
     in  a dataset and calculate the expected behaviour for iid data.
     'recordsPlot' counts records and reports the observations  at
     which they occur. In addition subsamples can be investigated with
     the help of the function 'ssrecordsPlot'. 
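
      Counting records and comparing with the iid benchmark can be
     sketched in base R. The helper 'record_flags' is a hypothetical
     name. For iid data from a continuous distribution the expected
     number of records among the first n observations is 1 + 1/2 +
     ... + 1/n.

```r
# Minimal sketch in base R; record_flags() is a hypothetical helper.
# An observation is a record if it exceeds all previous observations;
# the first observation is always a record.
record_flags <- function(x) {
  n <- length(x)
  if (n == 0) return(logical(0))
  c(TRUE, x[-1] > cummax(x)[-n])
}

set.seed(5)
x <- rnorm(1000)                   # simulated iid data
flags <- record_flags(x)

record_times <- which(flags)       # observations at which records occur
observed <- cumsum(flags)          # number of records up to time n
expected <- cumsum(1 / seq_along(x))   # iid benchmark: harmonic numbers

plot(observed, type = "s", log = "x", xlab = "n", ylab = "Records")
lines(expected, lty = 2)
```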

     *Plot of Kolmogorov's and Hartman-Wintner's Laws:* 

      The function 'sllnPlot' verifies Kolmogorov's strong law of 
     large numbers, and the function 'lilPlot' verifies 
     Hartman-Wintner's law of the iterated logarithm. 
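
      Both laws can be visualised with a few lines of base R. This is
     a sketch under the assumption of standard normal data (mean 0,
     variance 1), not the package implementation. The running mean
     should settle near the true mean, and the normalised partial sums
     should eventually stay mostly within the band from -1 to 1.

```r
# Minimal sketch in base R, assuming iid data with mean 0 and variance 1.
set.seed(6)
n <- 5000
x <- rnorm(n)
k <- seq_len(n)
partial_sums <- cumsum(x)

# Strong law of large numbers: the running mean converges to the true mean
running_mean <- partial_sums / k
plot(k, running_mean, type = "l", xlab = "n", ylab = "Running mean")
abline(h = 0, lty = 2)

# Law of the iterated logarithm: S_n / sqrt(2 n log log n) has limsup 1
idx <- k[k >= 3]                   # log(log(n)) is positive only for n >= 3
lil <- partial_sums[idx] / sqrt(2 * idx * log(log(idx)))
plot(idx, lil, type = "l", xlab = "n", ylab = "Normalised partial sum")
abline(h = c(-1, 1), lty = 2)
```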

     *ACF Plot of Exceedances over a Threshold:* 

      'xacfPlot' plots the autocorrelation functions of the heights of,
     and the distances between, exceedances over a threshold. 
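
      The two series behind such a plot can be sketched in base R.
     The variable names and the simulated Student-t sample are
     assumptions for illustration; whether the package measures
     heights as excesses over 'u' or as raw values is not specified
     here, and the sketch uses excesses.

```r
# Minimal sketch in base R; not the fExtremes implementation.
set.seed(7)
x <- rt(2000, df = 4)              # simulated heavy-tailed data
u <- unname(quantile(x, 0.95))     # threshold at the 95% quantile

idx <- which(x > u)                # positions of the exceedances
heights <- x[idx] - u              # excess heights over the threshold
gaps <- diff(idx)                  # distances between successive exceedances

# Autocorrelations up to lag 15, computed without plotting
acf_heights <- acf(heights, lag.max = 15, plot = FALSE)
acf_gaps <- acf(gaps, lag.max = 15, plot = FALSE)

old_par <- par(mfrow = c(1, 2))
plot(acf_heights, main = "Heights")
plot(acf_gaps, main = "Distances")
par(old_par)
```

      For iid data neither series should show significant
     autocorrelation; clustering of extremes tends to show up in both.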

_V_a_l_u_e:

     The functions return a plot.

_N_o_t_e:

     The plots are labelled by default with an x-label, a y-label and a
     main title. If the argument 'labels' is set to 'FALSE', no
     x-label, y-label or main title is added to the graph. To add
     user-defined label strings, use the function
     'title(xlab="...", ylab="...", main="...")'.

_A_u_t_h_o_r(_s):

     Some of the functions were implemented from Alec Stephenson's
     R-package 'evir', ported from Alexander McNeil's S library 'EVIS',
     _Extreme Values in S_; some from Alec Stephenson's R-package
     'ismev', based on Stuart Coles' code from his book _An
     Introduction to Statistical Modeling of Extreme Values_; and some
     were written by Diethelm Wuertz.

_R_e_f_e_r_e_n_c_e_s:

     Coles, S. (2001); _An Introduction to Statistical Modeling of
     Extreme Values_, Springer.

     Embrechts, P., Klueppelberg, C., Mikosch, T. (1997); _Modelling
     Extremal Events_, Springer.

_E_x_a_m_p_l_e_s:

      
     ## Danish fire insurance data:
        data(danishClaims)
        danishClaims = as.timeSeries(danishClaims)
        
     ## emdPlot -
        # Show Pareto tail behaviour:
        par(mfrow = c(2, 2), cex = 0.7)
        emdPlot(danishClaims) 
        
     ## qqparetoPlot -
        # QQ-Plot of heavy-tailed Danish fire insurance data:
        qqparetoPlot(danishClaims, xi = 0.7) 
      
     ## mePlot -
        # Sample mean excess plot of heavy-tailed Danish fire insurance data:
        mePlot(danishClaims)
           
     ## ssrecordsPlot -
        # Record fire insurance losses in Denmark:
        ssrecordsPlot(danishClaims, subsamples = 10) 

