HOMTESTS                package:nsRFA                R Documentation

_H_o_m_o_g_e_n_e_i_t_y _t_e_s_t_s

_D_e_s_c_r_i_p_t_i_o_n:

     Homogeneity tests for Regional Frequency Analysis.

_U_s_a_g_e:

      ADbootstrap.test (x, cod, Nsim=500, index=2)
      HW.tests (x, cod, Nsim=500)
      DK.test (x, cod)
      discordancy (x, cod)
      criticalD ()

_A_r_g_u_m_e_n_t_s:

       x: vector representing data from many samples defined with 'cod'

     cod: array that defines the data subdivision among sites

    Nsim: number of regions simulated with the bootstrap of the
          original region

   index: if 'index'=1 samples are divided by their average value; if
          'index'=2 (default) samples are divided by their median value

_D_e_t_a_i_l_s:

     *The Hosking and Wallis heterogeneity measures*

     The idea underlying Hosking and Wallis (1993) heterogeneity
     statistics is to measure the sample variability of the L-moment
     ratios and compare it to the variation that would be expected in a
     homogeneous region. The latter is estimated through repeated
     simulations of homogeneous regions with samples drawn from a four
     parameter kappa distribution (see e.g., Hosking and Wallis, 1997,
     pp. 202-204). In more detail, the steps are the following: for
     the k samples belonging to the region under analysis, compute the
     sample L-moment ratios (see Hosking and Wallis, 1997) pertaining
     to the i-th site: these are the L-coefficient of variation (L-CV),

 t^(i) = (1/ni sum[j from 1 to ni](2(j - 1)/(ni - 1) - 1) Y(i,j)) / (1/ni sum[j from 1 to ni] Y(i,j))

     the coefficient of L-skewness,

 t3^(i) = (1/ni sum[j from 1 to ni](6(j-1)(j-2)/(ni-1)/(ni-2) - 6(j-1)/(ni-1) + 1) Y(i,j)) / (1/ni sum[j from 1 to ni](2(j-1)/(ni-1) - 1) Y(i,j))

     and the coefficient of L-kurtosis

 t4^(i) = (1/ni sum[j from 1 to ni](20(j-1)(j-2)(j-3)/(ni-1)/(ni-2)/(ni-3) - 30(j-1)(j-2)/(ni-1)/(ni-2) + 12(j-1)/(ni-1) - 1) Y(i,j)) / (1/ni sum[j from 1 to ni](2(j-1)/(ni-1) - 1)Y(i,j))

     Note that the L-moment ratios are not affected by the
     normalization by the index value, i.e. it makes no difference
     whether X(i,j) or Y(i,j) is used in the above equations.
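     As an illustration, the three plug-in formulas above can be
     evaluated directly in base R (a minimal sketch; nsRFA computes
     L-moment ratios with its own routines):

```r
# Sample L-moment ratios of a single sample, computed from the plug-in
# formulas above (y is sorted so that Y(i,1) <= ... <= Y(i,ni)).
lmom_ratios <- function(y) {
  y <- sort(y)
  n <- length(y)
  j <- seq_len(n)
  l1 <- mean(y)
  l2 <- mean((2*(j-1)/(n-1) - 1) * y)
  l3 <- mean((6*(j-1)*(j-2)/((n-1)*(n-2)) - 6*(j-1)/(n-1) + 1) * y)
  l4 <- mean((20*(j-1)*(j-2)*(j-3)/((n-1)*(n-2)*(n-3)) -
              30*(j-1)*(j-2)/((n-1)*(n-2)) + 12*(j-1)/(n-1) - 1) * y)
  c(t = l2/l1, t3 = l3/l2, t4 = l4/l2)   # L-CV, L-skewness, L-kurtosis
}
```

     For the perfectly regular sample 1:10 this gives t = 1/3 and
     t3 = t4 = 0.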

     Define the regional averaged L-CV, L-skewness and L-kurtosis
     coefficients,

    t^R = (sum[i from 1 to k] ni t^(i)) / (sum[i from 1 to k] ni)


   t3^R = (sum[i from 1 to k] ni t3^(i)) / (sum[i from 1 to k] ni)


   t4^R = (sum[i from 1 to k] ni t4^(i)) / (sum[i from 1 to k] ni)

     and compute the statistic

 V = {sum[i from 1 to k] ni (t^(i) - t^R)^2 / sum[i from 1 to k] ni}^(1/2)
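     For example, t^R and the statistic V can be computed in base R
     for two hypothetical sites (an illustrative sketch with made-up
     records; HW.tests performs this computation internally):

```r
# Weighted regional average of site L-CVs and the V statistic above.
lcv <- function(y) {                 # site L-CV t^(i), as defined above
  y <- sort(y); n <- length(y); j <- seq_len(n)
  mean((2*(j-1)/(n-1) - 1) * y) / mean(y)
}
samples <- list(site1 = c(12, 18, 25, 31, 40),      # hypothetical data
                site2 = c(10, 15, 22, 35, 55, 80))
ti <- sapply(samples, lcv)           # t^(i) for each site
ni <- sapply(samples, length)        # record lengths ni
tR <- sum(ni * ti) / sum(ni)         # regional average L-CV t^R
V  <- sqrt(sum(ni * (ti - tR)^2) / sum(ni))
```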


     Fit the parameters of a four-parameter kappa distribution to the
     regional averaged L-moment ratios t^R, t3^R and t4^R, and then
     generate a large number Nsim of realizations of sets of k samples.
     The i-th site sample in each set has a kappa distribution as its
     parent and record length equal to ni. For each simulated
     homogeneous set, calculate the statistic V, obtaining Nsim values.
     From this vector of V values determine the mean muV and standard
     deviation sigmaV that relate to the hypothesis of homogeneity
     (actually, under the composite hypothesis of homogeneity and kappa
     parent distribution).

     A heterogeneity measure, here called HW1, is finally found as

                   theta(HW1) = (V - muV)/(sigmaV)

     theta(HW1) can be approximated by a normal distribution with zero
     mean and unit variance: following Hosking and Wallis (1997), the
     region under analysis can therefore be regarded as acceptably
     homogeneous if theta(HW1) < 1, possibly heterogeneous if 1 <=
     theta(HW1) < 2, and definitely heterogeneous if theta(HW1) >= 2.
     Hosking and Wallis (1997) suggest that these limits should be
     treated as useful guidelines. Even if the theta(HW1) statistic is
     constructed like a significance test, significance levels obtained
     from such a test would in fact be accurate only under special
     assumptions: that the data are independent both serially and
     between sites, and that the true regional distribution is kappa.

     Hosking and Wallis (1993) also give alternative heterogeneity
     measures (that we call HW2 and HW3), in which V is replaced by:

 V2 = sum[i from 1 to k] ni {(t^(i) - t^R)^2 + (t3^(i) - t3^R)^2}^(1/2) / sum[i from 1 to k] ni

     or

 V3 = sum[i from 1 to k] ni {(t3^(i) - t3^R)^2 + (t4^(i) - t4^R)^2}^(1/2) / sum[i from 1 to k] ni

     The test statistic in this case becomes

               theta(HW2) = (V2 - mu(V2)) / (sigma(V2))

     or

               theta(HW3) = (V3 - mu(V3)) / (sigma(V3))

     with similar acceptability limits as the HW1 statistic.  Hosking
     and Wallis (1997) judge theta(HW2) and theta(HW3) to be inferior
     to theta(HW1) and note that they rarely yield values larger than 2
     even for grossly heterogeneous regions.

     *The bootstrap Anderson-Darling test*

     A test that does not make any assumption on the parent
     distribution is the Anderson-Darling (AD) rank test (Scholz and
     Stephens, 1987). The AD test is the generalization of the
     classical Anderson-Darling goodness of fit test (e.g., D'Agostino
     and Stephens, 1986), and it is used to test the hypothesis that k
     independent samples belong to the same population without
     specifying their common distribution function.

     The test is based on the comparison between local and regional
     empirical distribution functions. The empirical distribution
     function, or sample distribution function, is defined by F(x) =
     j/eta, x(j) <= x < x(j+1), where eta is the size of the sample and
     x(j) are the order statistics, i.e. the observations arranged in
     ascending order. Denote the empirical distribution function of the
     i-th sample (local) by hatFi(x), and that of the pooled sample of
     all N = n1 + ... + nk observations (regional) by HN(x). The
     k-sample Anderson-Darling test statistic is then defined as

 theta(AD) = sum[i from 1 to k] ni integral[all x] ((hatFi(x) - HN(x))^2) / (HN(x) (1 - HN(x))) dHN(x)


     If the pooled ordered sample is Z1 < ... < ZN, the computational
     formula to evaluate theta(AD) is:

 theta(AD) = 1/N sum[i from 1 to k] 1/ni sum[j from 1 to N-1] ((N M(ij) - j ni)^2) / (j(N-j))

     where M(ij) is the number of observations in the i-th sample that
     are not greater than Zj. The homogeneity test can be carried out
     by comparing the obtained theta(AD) value to the tabulated
     percentage points reported by Scholz and Stephens (1987) for
     different significance levels.
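     The computational formula can be sketched in base R for samples
     without ties (illustrative only; ADbootstrap.test evaluates this
     statistic internally):

```r
# k-sample Anderson-Darling statistic theta(AD) via the computational
# formula above; 'samples' is a list of k numeric vectors without ties.
ad_ksample <- function(samples) {
  z <- sort(unlist(samples))         # pooled ordered sample Z1 < ... < ZN
  N <- length(z)
  j <- seq_len(N - 1)
  terms <- sapply(samples, function(x) {
    ni  <- length(x)
    Mij <- sapply(z[j], function(zj) sum(x <= zj))  # M(ij): obs <= Zj
    sum((N * Mij - j * ni)^2 / (j * (N - j))) / ni
  })
  sum(terms) / N
}
```

     With a single sample theta(AD) is exactly zero, since hatF1 and
     HN coincide.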

     The statistic theta(AD) depends on the sample values only through
     their ranks. This guarantees that the test statistic remains
     unchanged when the samples undergo monotonic transformations, an
     important stability property not possessed by HW heterogeneity
     measures. However, problems arise in applying this test in a
     common index value procedure. In fact, the index value procedure
     corresponds to dividing each site sample by a different value,
     thus modifying the ranks in the pooled sample. In particular, this
     has the effect of making the local empirical distribution
     functions much more similar to one another, giving an impression
     of homogeneity even when the samples are highly heterogeneous. The
     effect is analogous to that encountered when applying
     goodness-of-fit tests to distributions whose parameters are
     estimated from the same sample used for the test (e.g., D'Agostino
     and Stephens, 1986; Laio, 2004). In both cases, the percentage
     points for the test should be suitably redetermined. This can
     be done with a nonparametric bootstrap approach involving the
     following steps: build the pooled sample S of the observed
     non-dimensional data; sample with replacement from S to generate
     k artificial local samples, of size n1, ..., nk; divide each
     sample by its index value, and calculate theta^(1)(AD). Repeat
     the procedure Nsim times to obtain a sample of theta^(j)(AD),
     j = 1, ..., Nsim values, whose empirical distribution function can
     be used as an approximation of G(H0)(theta(AD)), the distribution
     of theta(AD) under the null hypothesis of homogeneity. The
     acceptance limits for the test, corresponding to any significance
     level alpha, are then easily determined as the quantiles of
     G(H0)(theta(AD)) corresponding to a probability (1-alpha).

     We will call the test obtained with the above procedure the
     bootstrap Anderson-Darling test, hereafter referred to as AD.
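     The bootstrap steps above can be sketched in base R (an
     illustrative sketch with made-up lognormal "sites"; in practice
     ADbootstrap.test implements the procedure, and ties introduced by
     resampling are handled only crudely here):

```r
# Nonparametric bootstrap of the k-sample Anderson-Darling statistic.
ad_ksample <- function(samples) {    # Scholz-Stephens computational formula
  z <- sort(unlist(samples)); N <- length(z); j <- seq_len(N - 1)
  sum(sapply(samples, function(x) {
    Mij <- sapply(z[j], function(zj) sum(x <= zj))
    sum((N * Mij - j * length(x))^2 / (j * (N - j))) / length(x)
  })) / N
}
set.seed(42)
obs <- list(rlnorm(30), rlnorm(25), rlnorm(35, sdlog = 2))  # 3rd site differs
obs <- lapply(obs, function(x) x / median(x))   # non-dimensional data
theta_obs <- ad_ksample(obs)
S  <- unlist(obs)                    # pooled sample S
ni <- sapply(obs, length)
theta_sim <- replicate(200, {        # Nsim = 200 bootstrap replicates
  sim <- lapply(ni, function(n) sample(S, n, replace = TRUE))
  sim <- lapply(sim, function(x) x / median(x))  # divide by index value
  ad_ksample(sim)
})
p_value <- mean(theta_sim >= theta_obs)  # exceedance probability under H0
```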

     *Durbin and Knott test*

     The last considered homogeneity test derives from a
     goodness-of-fit statistic originally proposed by Durbin and Knott
     (1971). The test is formulated to measure discrepancies in the
     dispersion of the samples, without accounting for the possible
     presence of discrepancies in the mean or skewness of the data.
     In this respect the test is similar to the HW1 test, while it
     resembles the AD test in being a rank test. The original
     goodness-of-fit test is very simple: suppose we have a sample Xi,
     i = 1, ..., n, with hypothetical distribution F(x); under the null
     hypothesis the random variable F(Xi) has a uniform distribution
     on the (0,1) interval, and the statistic D = sqrt(2/n) sum[i from
     1 to n] cos(2 pi F(Xi)) is approximately normally distributed
     with mean 0 and variance 1 (Durbin and Knott, 1971). D serves the
     purpose of detecting discrepancy in data dispersion: if the
     variance of Xi is greater than that of the hypothetical
     distribution F(x), D is significantly greater than 0, while D is
     significantly below 0 in the reverse case. Differences between the
     mean (or the median) of Xi and F(x) are instead not detected by D,
     which guarantees that the normalization by the index value does
     not affect the test.

     The extension to homogeneity testing of the  Durbin and Knott (DK)
     statistic is straightforward: we substitute the empirical
     distribution function obtained with the pooled observed data,
     HN(x), for F(x) in D, obtaining at each site a statistic

          Di = sqrt(2/ni) sum[j from 1 to ni] cos(2 pi HN(X(i,j)))

     which is approximately standard normal under the hypothesis of
     homogeneity. The statistic theta(DK) = sum[i from 1 to k] Di^2
     then has a chi-squared distribution with k-1 degrees of freedom,
     which allows one to determine the acceptability limits for the
     test corresponding to any significance level alpha.
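     A minimal base-R sketch of the procedure above (illustrative;
     DK.test implements the actual test, and the sqrt(2/ni) factor is
     assumed here as the normalization that makes each Di approximately
     standard normal):

```r
# Durbin-Knott homogeneity statistic theta(DK) and its distribution
# value under the chi-squared approximation with k-1 degrees of freedom.
dk_stat <- function(samples) {
  HN <- ecdf(unlist(samples))        # regional empirical distribution HN
  Di <- sapply(samples, function(x)
    sqrt(2 / length(x)) * sum(cos(2 * pi * HN(x))))
  theta <- sum(Di^2)
  k <- length(samples)
  c(theta.DK = theta, P = pchisq(theta, df = k - 1))
}
```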

     *Comparison among tests*

     The comparison (Viglione et al, 2007) shows that the Hosking and
     Wallis heterogeneity measure HW1 (only based on L-CV) is
     preferable when skewness is low, while the bootstrap
     Anderson-Darling test should be used for more skewed regions. As
     for HW2, the Hosking and Wallis heterogeneity measure based on
     L-CV and L-CA (L-skewness), the comparison confirms once more
     that it lacks power.

     Our suggestion is to guide the choice of the test according to a
     compromise between power and Type I error of the HW1 and AD tests.
     The L-moment space is divided into two regions: if the t3^R
     coefficient for the region under analysis is lower than 0.23, we
     propose to use the Hosking and Wallis heterogeneity measure HW1;
     if t3^R > 0.23, the bootstrap Anderson-Darling test is preferable.

_V_a_l_u_e:

     'ADbootstrap.test' and 'DK.test' give their test statistic and
     its distribution value P. If P is, for example, 0.92, the samples
     should not be considered heterogeneous at significance levels
     smaller than 8%.

     'HW.tests' gives the two Hosking and Wallis heterogeneity measures
     H_1 and H_2; following Hosking and Wallis (1997), the region under
     analysis can therefore be regarded as acceptably homogeneous if
     H_1 < 1, possibly heterogeneous if 1 <= H_1 < 2, and definitely
     heterogeneous if H_1 >= 2.

     'discordancy' returns the discordancy measure D of Hosking and
     Wallis for all sites. Hosking and Wallis suggest considering a
     site discordant if D >= 3 when N >= 15 (where N is the number of
     sites considered in the region). For N < 15 the critical values of
     D can be listed with 'criticalD'.

_N_o_t_e:

     For information on the package and the Author, and for all the
     references, see 'nsRFA'.

_S_e_e _A_l_s_o:

     'traceWminim', 'roi', 'KAPPA', 'HW.original'.

_E_x_a_m_p_l_e_s:

     data(hydroSIMN)
     annualflows
     summary(annualflows)
     x <- annualflows["dato"][,]
     cod <- annualflows["cod"][,]
     split(x,cod)

     #ADbootstrap.test(x,cod,Nsim=100)   # it takes some time
     #HW.tests(x,cod)                    # it takes some time
     DK.test(x,cod)

     fac <- factor(annualflows["cod"][,],levels=c(34:38))
     x2 <- annualflows[!is.na(fac),"dato"]
     cod2 <- annualflows[!is.na(fac),"cod"]

     ADbootstrap.test(x2,cod2,Nsim=100)
     ADbootstrap.test(x2,cod2,index=1,Nsim=200)
     HW.tests(x2,cod2,Nsim=100)
     DK.test(x2,cod2)

     discordancy(x,cod)

     criticalD()

