c_hat               package:AICcmodavg               R Documentation

_C_o_m_p_u_t_e _E_s_t_i_m_a_t_e _o_f _D_i_s_p_e_r_s_i_o_n _f_o_r _P_o_i_s_s_o_n _a_n_d _B_i_n_o_m_i_a_l _G_L_M'_s

_D_e_s_c_r_i_p_t_i_o_n:

     This function computes an estimate of c-hat for binomial or
     Poisson GLM's based on Pearson's chi-square divided by the
     residual degrees of freedom.

_U_s_a_g_e:

     c_hat(mod)

_A_r_g_u_m_e_n_t_s:

     mod: the model for which a c-hat estimate is required. 

_D_e_t_a_i_l_s:

     Poisson and binomial GLM's do not have a parameter for the
     variance and it is usually held fixed to 1 (i.e., mean =
     variance).  However, one must check whether this assumption is
     appropriate by estimating the overdispersion parameter (c-hat). 
     Though one can obtain an estimate of c-hat by dividing the
     residual deviance by the residual degrees of freedom, McCullagh
     and Nelder (1989) recommend using Pearson's chi-square divided by
     the residual degrees of freedom, which performs better.  The
     latter is the method implemented by 'c_hat'.

     Note that values of c-hat > 1 indicate overdispersion (variance >
     mean), but that values much higher than 1 (i.e., > 4) probably
     indicate lack-of-fit.  In cases of moderate overdispersion, one
     usually multiplies the variance-covariance matrix of the estimates
     by c-hat.  As a result, the SE's of the estimates are inflated
     (c-hat is also known as a variance inflation factor). 

     In model selection, c-hat should be estimated from the global
     model of the candidate model set and the same value of c-hat
     applied to the entire model set.  Specifically, a global model is
     the most complex model from which all the other models of the set
     are simpler versions (nested).  When no single global model exists
     in the set of models  considered, such as when sample size does
     not allow a complex model, one can estimate c-hat from 'subglobal'
     models.  Here, 'subglobal' models denote models from which only a
     subset of the models of the candidate set can be derived.  In such
     cases, one can use the smallest value of c-hat for model selection
     (Burnham and Anderson 2002).

     Note that c-hat counts as an additional parameter estimated and
     should be added to K.  All functions in package 'AICcmodavg'
     automatically add 1 when the 'c.hat' argument > 1 and apply the
     same value of c-hat for the entire model set.  When c-hat > 1,
     functions compute quasi-likelihood information criteria (either
     QAICc or QAIC, depending on the value of the 'second.ord'
     argument) by scaling the log-likelihood of the model by c-hat. 
     The value of c-hat can influence the ranking of the models:  as
     c-hat increases, QAIC or QAICc will favor models with fewer
     parameters.  As an additional check against this potential
     problem, on can create generate several model selection tables by
     incrementing values of c-hat to assess the model selection
     uncertainty.  If ranking changes little up to the c-hat value
     observed, one can be confident in making inference.

     In cases of underdispersion (c-hat < 1), it is recommended to keep
     the value of c-hat to 1.  However, note that values of c-hat << 1
     can also indicate lack-of-fit and that an alternative model (and
     distribution) should be investigated. 

     Note that it is only possible to estimate c-hat for binomial
     models with trials > 1 (i.e., success/trial or cbind(success,
     failure) syntax) or with Poisson GLM's.

_V_a_l_u_e:

     'c_hat' returns the estimated c-hat value

_A_u_t_h_o_r(_s):

     Marc J. Mazerolle

_R_e_f_e_r_e_n_c_e_s:

     Anderson, D. R. (2008) _Model-based Inference in the Life
     Sciences: a primer on evidence_. Springer: New York. 

     Burnham, K. P., Anderson, D. R. (2002) _Model Selection and
     Multimodel Inference: a practical information-theoretic approach_.
     Second edition. Springer: New York.

     Burnham, K. P., Anderson, D. R. (2004) Multimodel inference:
     understanding AIC and BIC in model selection. _Sociological
     Methods and Research_ *33*, 261-304. 

     Mazerolle, M. J. (2006) Improving data analysis in herpetology:
     using Akaike's Information Criterion (AIC) to assess the strength
     of biological hypotheses. _Amphibia-Reptilia_ *27*, 169-180.

     McCullagh, P., Nelder, J. A. (1989) _Generalized Linear Models_.
     Second edition. Chapman and Hall: New York.

_S_e_e _A_l_s_o:

     'AICc', 'confset', 'evidence', 'modavg', 'importance',
     'modavgpred'

_E_x_a_m_p_l_e_s:

     #binomial glm example
     set.seed(seed = 10)
     resp<-rbinom(n = 60, size = 1, prob = 0.5)
     set.seed(seed = 10)
     treat<-as.factor(sample(c(rep(x = "m", times = 30), rep(x = "f",
     times = 30))))
     age <- as.factor(c(rep("young", 20), rep("med", 20), rep("old", 20)))
     #each invidual has its own response (n = 1)
     mod1 <- glm(resp ~ treat + age, family = binomial)
     ## Not run: 
     c_hat(mod1) #gives an error because model not appropriate for
     ##computation of c-hat
     ## End(Not run)

     ##computing table to summarize successes
     table(resp, treat, age)
     dat2 <- as.data.frame(table(resp, treat, age)) #not quite what we need
     data2 <- data.frame(success = c(9, 4, 2, 3, 5, 2), sex = c("f", "m",
     "f", "m", "f", "m"),  age = c("med", "med", "old", "old", "young",
     "young"), total = c(13, 7, 10, 10, 7, 13))
     data2$prop <- data2$success/data2$total
     data2$fail <- data2$total - data2$success

     ##run model with success/total syntax using weights argument
     mod2 <- glm(prop ~ sex + age, family = binomial, weights = total,
     data = data2)
     c_hat(mod2)

     ##run model with other syntax cbind(success, fail)
     mod3 <- glm(cbind(success, fail) ~ sex + age, family = binomial,
     data = data2) 
     c_hat(mod3)

