calibration.plot       package:PresenceAbsence       R Documentation

_C_a_l_i_b_r_a_t_i_o_n _P_l_o_t

_D_e_s_c_r_i_p_t_i_o_n:

     'calibration.plot' produces a goodness-of-fit plot for
     Presence/Absence data.

_U_s_a_g_e:

     calibration.plot(DATA, which.model = 1, na.rm = FALSE, alpha = 0.05,
                      N.bins = 5,
                      xlab = "Predicted Probability of Occurrence",
                      ylab = "Observed Occurrence as Proportion of Sites Surveyed",
                      main = NULL, color = NULL, model.names = NULL)

_A_r_g_u_m_e_n_t_s:

    DATA: a matrix or dataframe of observed and predicted values where
          each row represents one plot and where columns are:

                  'DATA[,1]'  plot ID                                            text
                  'DATA[,2]'  observed values                                    zero-one values
                  'DATA[,3]'  predicted probabilities from first model           numeric (between 0 and 1)
                  'DATA[,4]'  predicted probabilities from second model, etc...  

which.model: a number indicating which model from 'DATA' should be used

   na.rm: a logical indicating whether missing values should be removed

   alpha: significance level for the confidence intervals (for
          example, 0.05 gives 95 percent intervals)

  N.bins: number of bins to split predicted probabilities into 

    xlab: a title for the x axis

    ylab: a title for the y axis

    main: an overall title for the plot 

   color: a logical or a vector of color codes 

model.names: a vector of the names of each model included in 'DATA'

_D_e_t_a_i_l_s:

     Takes a single model and creates a goodness-of-fit plot of
     observed versus predicted values. The plots are grouped into bins
     based on their predicted values, and then the bin prevalence (the
     ratio of the number of plots in the bin observed as present to
     the total number of plots in the bin) is calculated for each bin.
     The confidence interval for each bin is also plotted, and the
     number of plots in each bin is labeled above that bin.
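
     The binning step can be sketched in base R as follows. This is
     an illustration only, not the package's internal code: the data
     frame 'toy' and its column names are hypothetical, and
     equal-width bins on [0, 1] are an assumption, since the bin
     boundaries are not stated here.

```r
# Hypothetical data in the layout described under Arguments:
# column 1 = plot ID, column 2 = observed 0/1, column 3 = predicted probability.
toy <- data.frame(
  ID   = paste0("p", 1:10),
  obs  = c(0, 0, 0, 1, 0, 1, 1, 0, 1, 1),
  pred = c(0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95)
)

N.bins <- 5

# Assumption: equal-width bins spanning [0, 1].
breaks <- seq(0, 1, length.out = N.bins + 1)
bin <- cut(toy$pred, breaks = breaks, include.lowest = TRUE)

# Number of plots per bin and number of those observed as present.
n.plots   <- as.vector(table(bin))
n.present <- as.vector(tapply(toy$obs, bin, sum))

# Bin prevalence: presences divided by total plots in the bin.
bin.prevalence <- n.present / n.plots
```

     An empty bin would give a zero count and an undefined
     prevalence, so real data may need fewer bins than this toy case.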

     Confidence intervals are calculated for the binomial bin counts
     using the F distribution.
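
     One standard way to obtain a binomial interval from the F
     distribution is the exact (Clopper-Pearson) interval. The sketch
     below assumes that form; the function name 'exact.binom.ci' is
     hypothetical and the package's own calculation may differ in
     detail.

```r
# Hypothetical helper (not part of PresenceAbsence): exact two-sided
# binomial confidence interval computed from F-distribution quantiles.
exact.binom.ci <- function(n.present, n.plots, alpha = 0.05) {
  x <- n.present
  n <- n.plots

  # Lower limit; set to 0 when no presences are observed.
  if (x == 0) {
    lower <- 0
  } else {
    f.lo  <- qf(alpha / 2, 2 * x, 2 * (n - x + 1))
    lower <- x * f.lo / (n - x + 1 + x * f.lo)
  }

  # Upper limit; set to 1 when every plot in the bin is a presence.
  if (x == n) {
    upper <- 1
  } else {
    f.hi  <- qf(1 - alpha / 2, 2 * (x + 1), 2 * (n - x))
    upper <- (x + 1) * f.hi / (n - x + (x + 1) * f.hi)
  }

  c(lower = lower, upper = upper)
}

# Interval for a bin with 8 presences among 10 plots.
ci <- exact.binom.ci(n.present = 8, n.plots = 10)
```

     Under this assumption the result should agree with the interval
     reported by 'binom.test(8, 10)' in the stats package, which also
     uses the Clopper-Pearson method.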

     Unlike a typical goodness-of-fit plot from a linear regression
     model, with Presence/Absence data having all the points lie along
     the diagonal does not necessarily imply a good quality model. The
     ideal calibration plot for Presence/Absence data depends on the
     intended use of the model.

     If the model is to be used to produce probability maps, then it is
     indeed desirable that (for example) 80 percent of plots with
     predicted probability of 0.8 actually do have observed Presence.
     In this case, having all the bins along the diagonal does indicate
     a good model.

     However, if the model is to be used simply to predict species
     presence, then all that is required is that some threshold exists
     (not necessarily 0.5) where every plot with a lower predicted
     probability is observed Absent, and every plot with a higher
     predicted probability is observed Present. In this case, a good
     model will not necessarily (in fact, will rarely) have all the
     bins along the diagonal. (Note: for this purpose
     'presence.absence.hist' may produce more useful diagnostics.)
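
     A small made-up case illustrates the point. All values below are
     hypothetical, and equal-width bins are an assumption: observed
     presence is fully determined by a threshold of 0.3, so
     classification is perfect, yet the bin prevalences do not follow
     the diagonal.

```r
# Hypothetical predictions; observed presence is set by a 0.3 threshold.
threshold <- 0.3
pred <- seq(0.05, 0.95, by = 0.1)
obs  <- as.integer(pred > threshold)

# Classifying at the same threshold reproduces every observation.
predicted.class <- as.integer(pred > threshold)
accuracy <- mean(predicted.class == obs)

# Bin prevalence with five equal-width bins (an assumption).
bin <- cut(pred, breaks = seq(0, 1, by = 0.2), include.lowest = TRUE)
bin.prevalence <- as.vector(tapply(obs, bin, mean))

# Distance of each bin from the diagonal, using bin midpoints.
bin.midpoint <- seq(0.1, 0.9, by = 0.2)
deviation <- bin.prevalence - bin.midpoint
```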

     If all the bins lie above the diagonal, or all the bins lie below
     the diagonal, it may indicate that the training and test datasets
     have different prevalence. In this case, it may be worthwhile to
     re-examine the initial data selection.

_V_a_l_u_e:

     creates a graphical plot

_N_o_t_e:

_A_u_t_h_o_r(_s):

     Elizabeth Freeman eafreeman@fs.fed.us

_R_e_f_e_r_e_n_c_e_s:

     Vaughan, I. P. and Ormerod, S. J. (2005) The continuing challenges
     of testing species distribution models. J. Appl. Ecol.,
     42:720-730.

_S_e_e _A_l_s_o:

     presence.absence.summary, presence.absence.hist

_E_x_a_m_p_l_e_s:

     library(PresenceAbsence)

     data(SIM3DATA)

     # Default settings: first model, 5 bins
     calibration.plot(SIM3DATA)

     # Third model, 10 bins, missing values removed
     calibration.plot(       DATA=SIM3DATA,
                             which.model=3,
                             na.rm=TRUE,
                             alpha=0.05,
                             N.bins=10,
                             xlab="Predicted Probability of Occurrence",
                             ylab="Observed Occurrence as Proportion of Sites Surveyed",
                             model.names=NULL,
                             main=NULL)

