| gamboost {mboost} | R Documentation |
Description:

Gradient boosting for optimizing arbitrary loss functions, where
component-wise smoothing splines are utilized as base learners.
Usage:

## S3 method for class 'formula':
gamboost(formula, data = list(), weights = NULL, ...)
## S3 method for class 'matrix':
gamboost(x, y, weights = NULL, ...)
gamboost_fit(object, baselearner = c("ssp", "bsp", "ols"),
             dfbase = 4, family = GaussReg(),
             control = boost_control(), weights = NULL)
Arguments:

formula      a symbolic description of the model to be fit.

data         a data frame containing the variables in the model.

weights      an optional vector of weights to be used in the fitting
             process.

x            design matrix.

y            vector of responses.

object       an object of class boost_data, see boost_dpp.

baselearner  a character string specifying the component-wise base
             learner to be used: "ssp" means smoothing splines, "bsp"
             means B-splines (see bs) and "ols" means linear models.
             Please note that only the characteristics of
             component-wise smoothing splines have been investigated,
             theoretically and practically, so far (see the sketch
             after this list).

dfbase       an integer vector giving the degrees of freedom for the
             smoothing spline, either globally for all variables (when
             its length is one) or separately for each covariate.

family       an object of class boost_family implementing the negative
             gradient corresponding to the loss function to be
             optimized; by default, squared error loss for continuous
             responses is used.

control      an object of class boost_control.

...          additional arguments passed to callees.
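For illustration, a minimal sketch of switching the base learner and of
a per-covariate dfbase; this assumes that arguments supplied via ... of
the formula method (here, baselearner) are forwarded to gamboost_fit,
as the usage above suggests:

library("mboost")

### component-wise B-splines instead of the default smoothing splines
### (assumes baselearner is passed on through ... to gamboost_fit)
cars.bsp <- gamboost(dist ~ speed, data = cars,
                     baselearner = "bsp",
                     control = boost_control(mstop = 50))

### hypothetical two-covariate model with separate degrees of freedom
### per smoothing spline (the data frame `d` is an assumption)
## gamboost(y ~ x1 + x2, data = d, dfbase = c(4, 6))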
Details:

A (generalized) additive model is fitted using a boosting algorithm
based on component-wise univariate smoothing splines. The methodology
is described in Bühlmann and Yu (2003). If dfbase = 1, a univariate
linear model is used as base learner for the corresponding variable,
resulting in a linear partial fit for that variable.

The function gamboost_fit provides access to the fitting procedure
without data pre-processing, e.g. for cross-validation; a sketch is
given below.
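As a sketch of calling the fitting procedure directly, the following
assumes that boost_dpp accepts a formula, a data frame and optional
weights, and returns the boost_data object expected by gamboost_fit
(the Arguments section points to boost_dpp, but its signature is not
shown here):

library("mboost")

### pre-process the data once, then fit without the formula interface;
### the boost_dpp call is an assumption, see the lead-in above
dpp <- boost_dpp(dist ~ speed, data = cars)
mod <- gamboost_fit(dpp, dfbase = 4,
                    control = boost_control(mstop = 50))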
Value:

An object of class gamboost, for which print, AIC and predict methods
are available.
References:

Peter Bühlmann and Bin Yu (2003), Boosting with the L2 loss: regression
and classification. Journal of the American Statistical Association,
98, 324–339.

Peter Bühlmann and Torsten Hothorn (2007), Boosting algorithms:
regularization, prediction and model fitting. Statistical Science,
accepted.
ftp://ftp.stat.math.ethz.ch/Research-Reports/Other-Manuscripts/buhlmann/BuehlmannHothorn_Boosting-rev.pdf
Examples:

library("mboost")

### a simple two-dimensional example: cars data
cars.gb <- gamboost(dist ~ speed, data = cars, dfbase = 4,
control = boost_control(mstop = 50))
cars.gb
AIC(cars.gb, method = "corrected")
### plot fit for mstop = 1, ..., 50
plot(dist ~ speed, data = cars)
tmp <- sapply(1:mstop(AIC(cars.gb)), function(i)
lines(cars$speed, predict(cars.gb[i]), col = "red"))
lines(cars$speed, predict(smooth.spline(cars$speed, cars$dist),
cars$speed)$y, col = "green")
### artificial example: sine transformation
x <- sort(runif(100)) * 10
y <- sin(x) + rnorm(length(x), sd = 0.25)
plot(x, y)
### linear model
lines(x, fitted(lm(y ~ sin(x) - 1)), col = "red")
### GAM
lines(x, fitted(gamboost(y ~ x - 1,
control = boost_control(mstop = 500))),
col = "green")