Distribution Test Data, and log relative error function   package:accuracy   R Documentation

_B_e_n_c_h_m_a_r_k _d_a_t_a _t_o _t_e_s_t _t_h_e _a_c_c_u_r_a_c_y _o_f _s_t_a_t_i_s_t_i_c_a_l _d_i_s_t_r_i_b_u_t_i_o_n _f_u_n_c_t_i_o_n_s, _f_u_n_c_t_i_o_n _t_o _c_o_m_p_u_t_e _l_o_g _r_e_l_a_t_i_v_e _e_r_r_o_r

_D_e_s_c_r_i_p_t_i_o_n:

     This is benchmark data, used to test the accuracy of statistical
     distribution functions. Also included is a function for computing
     log-relative error, a measure of accuracy.

_U_s_a_g_e:

             data(ttst)
             data(ftst)
             data(gammatst)
             data(normtst)
             data(chisqtst)

_D_e_t_a_i_l_s:

     These test values are tab-delimited ASCII datasets, with a
     variable header line. Typical variables are:

     P - probability value
     DF - degrees of freedom
     INVDIST - the inverse function value for P, i.e., invt(P,DF) = INVT
     PINVDIST - the cumulative distribution value at INVDIST, i.e.,
     cumt(INVT,DF) = PINVT

     Note that, because of limits in numerical precision,
     cumt(invt(P,DF)) is not always exactly equal to P.
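
     The round trip described above can be sketched in base R, under the
     assumption that pt() and qt() stand in for cumt and invt. The
     probabilities and the degrees of freedom below are illustrative
     choices, not values taken from the benchmark data.

```r
# Round-trip check: pt(qt(p, df), df) should return p, up to rounding error.
# The p values and df here are illustrative, not taken from the benchmark files.
p  <- c(1e-12, 1e-6, 0.025, 0.5, 0.975, 1 - 1e-6)
df <- 5

roundtrip <- pt(qt(p, df), df)

# Relative discrepancy for each p; a zero means the round trip was exact.
rel_err <- abs(roundtrip - p) / p
print(data.frame(p = p, roundtrip = roundtrip, rel_err = rel_err))
```

     Any nonzero entries in rel_err show how far the round trip departs
     from P on a given platform.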

     Of course, not all possible values of P and DF can be listed. The
     test values for P and DF were created from systematic and random
     samples in {[1e-12, 1-1e-12], 0, 1} and [0, 1E5], respectively.

     We used Knuesel's ELV program (1989) to calculate the values of
     the cumulative and inverse distributions. Knuesel claims the
     results are accurate to 6 digits. We checked these results using
     Brown's (1998) DCDFLIB library. The results of both calculations
     agreed to the 6 digits supplied by ELV, but in a small number of
     cases, DCDFLIB's calculation at the 7th digit indicated that the
     sixth digit would change if rounded.

     Note that both ELV and DCDFLIB can generate many more
     distributions than are included here. Other resources are
     described in the references.

_V_a_l_u_e:

     LRE() returns a new vector of log-relative errors (log-absolute
     error where c_i = 0). The resulting values are roughly
     interpretable as the number of significant digits of agreement
     between c and x. Larger numbers indicate that x is a relatively
     more accurate approximation of c. Numbers less than or equal to
     zero indicate complete disagreement between x and c.
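
     As a rough illustration of the measure, the sketch below computes
     -log10 of the relative error, falling back to the absolute error
     where the correct value is zero. The name lre_sketch and its
     argument names are hypothetical, and this is not the package's
     LRE() implementation, which may treat edge cases such as exact
     agreement differently.

```r
# Hypothetical helper, NOT the package's LRE(): a minimal sketch of
# log-relative error under the definition given in the Value section.
lre_sketch <- function(x, correct) {
  stopifnot(length(x) == length(correct))
  # Relative error where correct != 0; absolute error where correct == 0.
  err <- ifelse(correct == 0,
                abs(x - correct),
                abs(x - correct) / abs(correct))
  # -log10 of the error approximates the number of agreeing digits.
  # In this sketch, exact agreement (err == 0) yields Inf.
  -log10(err)
}

lre_sketch(1.001, 1)  # close to 3: about three digits of agreement
lre_sketch(20, 1)     # negative: complete disagreement
lre_sketch(1e-8, 0)   # correct value is zero, so absolute error is used
```

     Here correct plays the role of c in the description above.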

_A_u_t_h_o_r(_s):

     Micah Altman Micah_Altman@harvard.edu <URL:
     http://www.hmdc.harvard.edu/micah_altman/>, Michael McDonald

_R_e_f_e_r_e_n_c_e_s:

     Altman, M., J. Gill and M. P. McDonald.  2003.  _Numerical Issues
     in Statistical Computing for the Social Scientist_.  John Wiley &
     Sons. <URL: http://www.hmdc.harvard.edu/numerical_issues/>

     Altman, Micah and Michael McDonald, 2001. "Choosing Reliable
     Statistical Software."  _PS: Political Science & Politics_ 24(3):
     681-8

     Barry W. Brown, James Lovato, Kathy Russell, 1998,   DCDFLIB,
     <URL: ftp://odin.mda.uth.tmc.edu in pub/unix/dcdflib.c-1.0-tar.Z>

     Knuesel, L. 1989. _Computergestuetzte Berechnung statistischer
     Verteilungen._ Oldenbourg, Muenchen-Wien. <URL:
     http://www.stat.uni-muenchen.de/~knuesel/elv/elv.html>

_E_x_a_m_p_l_e_s:

     # simple LRE examples
     LRE(1.001,1) # roughly 3 significant digits agreement
     LRE(1,1) # complete agreement
     LRE(20,1) # complete disagreement

     #
     # how accurate are Student's t distribution functions?
     #

     data(ttst)
     # compute t quantiles using benchmark data
     tqt = qt(ttst$p,ttst$df)

     # compute log-relative-error (LRE) of  qt() results, compared to 
     # correct answers

     lrq = LRE(tqt, ttst$invt)

     # if any entries have an LRE below 5, there may be
     # significant inaccuracies in the qt() function

     table(trunc(lrq))

     # now repeat process, for pt()

     tpt = pt(ttst$invt,ttst$df)
     lrp = LRE(tpt, ttst$pinvt)
     table(trunc(lrp))

