| cumulative {VGAM} | R Documentation |
Fits a cumulative logit/probit/cloglog/cauchit/... regression model to a (preferably ordered) factor response.
cumulative(link = "logit", earg = list(),
parallel = FALSE, reverse = FALSE,
mv = FALSE, intercept.apply = FALSE)
In the following, the response Y is assumed to be a factor with ordered values 1,2,...,M+1, so that M is the number of linear/additive predictors eta_j.
link |
Link function applied to the M cumulative probabilities.
See Links for more choices.
|
earg |
List. Extra argument for the link function.
See earg in Links for general information.
|
parallel |
A logical, or formula specifying which terms have
equal/unequal coefficients.
|
reverse |
Logical.
By default, the cumulative probabilities used are
P(Y<=1), P(Y<=2),
..., P(Y<=M).
If reverse is TRUE, then
P(Y>=2), P(Y>=3), ...,
P(Y>=M+1) will be used.
This should be set to TRUE for link=
golf,
polf,
nbolf.
For these links the cutpoints must be an increasing sequence;
if reverse=FALSE then the cutpoints must be a decreasing sequence.
|
mv |
Logical.
Multivariate response? If TRUE then the input should be
a matrix with values 1,2,...,L, where L is the
number of levels.
Each column of the matrix is a response, i.e., multivariate response.
A suitable matrix can be obtained from Cut.
|
intercept.apply |
Logical.
Whether the parallel argument should be applied to the intercept term.
This should be set to TRUE for link=
golf,
polf,
nbolf.
|
By default, the non-parallel cumulative logit model is fitted, i.e.,
eta_j = logit(P[Y<=j])
where j=1,2,...,M and
the eta_j are not constrained to be parallel.
This is also known as the non-proportional odds model.
If the logit link is replaced by a complementary log-log link
(cloglog) then
this is known as the proportional-hazards model.
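As a rough illustration of the definitions above, the following base-R sketch computes the cumulative probabilities and the corresponding logit-link linear predictors from a made-up vector of category probabilities. The probabilities and object names are hypothetical, and nothing here calls VGAM itself.

```r
# Hypothetical category probabilities for a response with M + 1 = 3 levels
p <- c(0.5, 0.3, 0.2)
M <- length(p) - 1

# Default (reverse = FALSE): P(Y <= j), j = 1, ..., M
cum_le <- cumsum(p)[1:M]

# reverse = TRUE: P(Y >= j + 1), j = 1, ..., M
cum_ge <- rev(cumsum(rev(p)))[2:(M + 1)]

# Linear predictors under the logit link: eta_j = logit(P[Y <= j])
eta <- qlogis(cum_le)
eta_rev <- qlogis(cum_ge)
```

Because P(Y >= j+1) = 1 - P(Y <= j), the reversed logits are simply the negatives of the default ones; with other links (for example cloglog) the two parameterizations are not related so simply.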
In almost all the literature, the constraint matrices associated
with this family of models are known. For example, setting
parallel=TRUE will make all constraint matrices (except for
the intercept) equal to a vector of M 1's.
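To make the parallel=TRUE case concrete, here is a sketch for a single covariate x. The symbols beta_{(j)1}, beta_2, H_1 and H_2 are notation assumed for this illustration only.

```latex
\eta_j = \beta_{(j)1} + \beta_2 \, x, \qquad j = 1, \ldots, M,
\qquad\text{with}\qquad
\mathbf{H}_1 = \mathbf{I}_M \ \text{(intercept)}, \qquad
\mathbf{H}_2 = \mathbf{1}_M = (1, \ldots, 1)^{\top} \ \text{(covariate } x\text{)}.
```

The intercept keeps M free parameters (one per cutpoint), while the covariate contributes a single slope shared by all M linear predictors.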
If the constraint matrices are equal, unknown and to be estimated, then
this can be achieved by fitting the model as a
reduced-rank vector generalized
linear model (RR-VGLM; see rrvglm).
Currently, reduced-rank vector generalized additive models
(RR-VGAMs) have not been implemented here.
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm,
rrvglm
and vgam.
No check is made to verify that the response is ordinal;
see ordered.
The response should be either a matrix of counts (with row sums that
are all positive), or a factor. In both cases, the y slot
returned by vglm/vgam/rrvglm is the matrix
of counts.
For a nominal (unordered) factor response, the multinomial
logit model (multinomial) is more appropriate.
With the logit link, setting parallel=TRUE will fit a
proportional odds model. Note that the TRUE here does
not apply to the intercept term.
In practice, the validity of the proportional odds
assumption needs to be checked, e.g., by a likelihood ratio test.
If acceptable on the data,
then numerical problems are less likely to occur during the fitting,
and there are fewer parameters. Numerical problems occur when
the linear/additive predictors cross, which results in probabilities
outside of (0,1); setting parallel=TRUE will help avoid
this problem.
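The crossing problem described above can be sketched in base R. The intercepts and slopes below are hypothetical values chosen so that the two non-parallel linear predictors cross at x = 2; the helper middle_prob is likewise invented for this illustration and is not part of VGAM.

```r
# Hypothetical coefficients for M = 2 cumulative logits
a <- c(-1, 1)            # intercepts (cutpoints), increasing
b_nonpar <- c(1.5, 0.5)  # unequal slopes: the two lines cross at x = 2
b_par <- c(1, 1)         # equal slopes: the two lines never cross

# P(Y = 2) = P(Y <= 2) - P(Y <= 1), with eta_j = logit(P[Y <= j])
middle_prob <- function(x, a, b) {
  eta <- a + b * x
  cum <- plogis(eta)
  cum[2] - cum[1]
}

x_grid <- c(0, 1, 3, 5)
p_nonpar <- sapply(x_grid, middle_prob, a = a, b = b_nonpar)
p_par <- sapply(x_grid, middle_prob, a = a, b = b_par)
```

For x beyond the crossing point the non-parallel fit implies a negative value for P(Y = 2), whereas the parallel version stays positive at every x because the ordering of the linear predictors is fixed by the intercepts alone.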
Here is an example of the usage of the parallel argument.
If there are covariates x1, x2 and x3, then
parallel = TRUE ~ x1 + x2 -1 and
parallel = FALSE ~ x3 are equivalent. This would constrain
the regression coefficients for x1 and x2 to be
equal; those of the intercepts and x3 would be different.
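As a sketch of what this specification implies, the base-R snippet below writes out the constraint matrices one would expect for M = 2. The value of M and the list H are assumptions made for illustration; in a real fit, constraints() applied to the fitted object reports the matrices actually used.

```r
M <- 2  # hypothetical: a response with M + 1 = 3 levels

# Parallel terms get a single column of ones (one shared coefficient);
# non-parallel terms get an identity matrix (M separate coefficients).
H <- list(
  "(Intercept)" = diag(M),
  x1 = matrix(1, nrow = M, ncol = 1),
  x2 = matrix(1, nrow = M, ncol = 1),
  x3 = diag(M)
)

# Total number of regression coefficients estimated
n_par <- sum(sapply(H, ncol))
```

Here the intercept and x3 each contribute M coefficients and x1 and x2 one each, giving 2M + 2 in total.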
In the future, this family function may be renamed to
"cups" (for cumulative probabilities)
or "cute" (for cumulative probabilities).
Thomas W. Yee
Agresti, A. (2002) Categorical Data Analysis, 2nd ed. New York: Wiley.
Dobson, A. J. (2001) An Introduction to Generalized Linear Models, 2nd ed. Boca Raton: Chapman & Hall/CRC Press.
McCullagh, P. and Nelder, J. A. (1989) Generalized Linear Models, 2nd ed. London: Chapman & Hall.
Simonoff, J. S. (2003) Analyzing Categorical Data. New York: Springer-Verlag.
Yee, T. W. and Wild, C. J. (1996) Vector generalized additive models. Journal of the Royal Statistical Society, Series B, Methodological, 58, 481–493.
Documentation accompanying the VGAM package at http://www.stat.auckland.ac.nz/~yee contains further information and examples.
acat,
cratio,
sratio,
multinomial,
pneumo,
logit,
probit,
cloglog,
cauchit,
golf,
polf,
nbolf.
# Fit the proportional odds model, p.179, in McCullagh and Nelder (1989)
data(pneumo)
pneumo = transform(pneumo, let=log(exposure.time))
(fit = vglm(cbind(normal, mild, severe) ~ let,
cumulative(parallel=TRUE, reverse=TRUE), pneumo))
fit@y # Sample proportions
weights(fit, type="prior") # Number of observations
coef(fit, matrix=TRUE)
constraints(fit) # Constraint matrices
# Check that the model is linear in let
fit2 = vgam(cbind(normal, mild, severe) ~ s(let, df=2),
cumulative(reverse=TRUE), pneumo)
## Not run:
plot(fit2, se=TRUE, overlay=TRUE, lcol=1:2, scol=1:2)
## End(Not run)
# Check the proportional odds assumption with a likelihood ratio test
(fit3 = vglm(cbind(normal, mild, severe) ~ let,
cumulative(parallel=FALSE, reverse=TRUE), pneumo))
# Upper-tail p-value of the likelihood ratio statistic
pchisq(2*(logLik(fit3)-logLik(fit)),
       df=length(coef(fit3))-length(coef(fit)), lower.tail=FALSE)