ssanova                 package:gss                 R Documentation

_F_i_t_t_i_n_g _S_m_o_o_t_h_i_n_g _S_p_l_i_n_e _A_N_O_V_A _M_o_d_e_l_s

_D_e_s_c_r_i_p_t_i_o_n:

     Fit smoothing spline ANOVA models in Gaussian regression.  The
     symbolic model specification via 'formula' follows the same rules
     as in 'lm'.

_U_s_a_g_e:

     ssanova(formula, type=NULL, data=list(), weights, subset, offset,
             na.action=na.omit, partial=NULL, method="v", alpha=1.4,
             varht=1, id.basis=NULL, nbasis=NULL, seed=NULL, random=NULL)

_A_r_g_u_m_e_n_t_s:

 formula: Symbolic description of the model to be fit.

    type: List specifying the type of spline for each variable. See
          'mkterm' for details.

    data: Optional data frame containing the variables in the model.

 weights: Optional vector of weights to be used in the fitting process.

  subset: Optional vector specifying a subset of observations to be
          used in the fitting process.

  offset: Optional offset term with known parameter 1.

na.action: Function which indicates what should happen when the data
          contain NAs.

 partial: Optional extra unpenalized terms in partial spline models.

  method: Method for smoothing parameter selection.  Supported are
          'method="v"' for GCV, 'method="m"' for GML (REML), and
          'method="u"' for Mallows' CL.

   alpha: Parameter modifying GCV or Mallows' CL; larger absolute
          values yield smoother fits.  A negative value invokes a
          stable and more accurate GCV/CL evaluation algorithm, which
          may take two to five times as long.  Ignored when
          'method="m"' is specified.

   varht: External variance estimate needed for 'method="u"'.  Ignored
          when 'method="v"' or 'method="m"' is specified.

id.basis: Indices of the observations selected as "knots".

  nbasis: Number of "knots" to be selected.  Ignored when 'id.basis' is
          supplied.

    seed: Seed to be used for the random generation of "knots". Ignored
          when 'id.basis' is supplied.

  random: Input for parametric random effects in nonparametric
          mixed-effect models.  See 'mkran' for details.

_D_e_t_a_i_l_s:

     The model specification via 'formula' is intuitive.  For example,
     'y~x1*x2' yields a model of the form

          y = C + f_{1}(x1) + f_{2}(x2) + f_{12}(x1,x2) + e

     with the terms denoted by '"1"', '"x1"', '"x2"', and '"x1:x2"'.
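
     Because the specification follows the 'lm' rules, base R's
     'terms' can be used to preview how a formula expands before
     fitting.  The sketch below is illustrative only and calls no
     'gss' functions; the intercept it reports corresponds to the
     term denoted '"1"' above.

```r
## Expand 'y~x1*x2' with base R to preview the model terms
tt <- terms(y ~ x1 * x2)
term.labels <- attr(tt, "term.labels")        # main effects and interaction
has.intercept <- attr(tt, "intercept") == 1   # the constant term
print(term.labels)
print(has.intercept)
```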

     The model terms are sums of unpenalized and penalized terms.
     Attached to every penalized term there is a smoothing parameter,
     and the model complexity is largely determined by the number of
     smoothing parameters.

     A subset of the observations is selected as "knots."  Unless
     specified via 'id.basis' or 'nbasis', the number of "knots" q is
     determined by max(30,10n^{2/9}), which is appropriate for the
     default cubic splines for numerical vectors.
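
     The default rule can be evaluated directly to see how slowly q
     grows with n.  In this sketch the helper name 'default.nbasis'
     is hypothetical, and rounding up to an integer is an assumption
     that the formula above does not state.

```r
## Default number of knots: q = max(30, 10 n^(2/9))
## Rounding up to an integer is an assumption of this sketch.
default.nbasis <- function(n) pmax(30, ceiling(10 * n^(2/9)))
q <- default.nbasis(c(100, 1000, 10000))
print(q)
```

     For n = 100 the lower bound of 30 applies; the formula only
     takes over for larger samples.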

     Using q "knots," 'ssanova' calculates an approximate solution to
     the penalized least squares problem using algorithms of the order
     O(nq^{2}), which for q<<n scale better than the O(n^{3})
     algorithms of 'ssanova0'.  For the exact solution, one may set q=n
     in 'ssanova', but 'ssanova0' would be much faster.
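
     As a rough illustration of the scaling, the leading-order
     operation counts can be compared for a few sample sizes.  This
     is a back-of-the-envelope sketch: constants are ignored, so it
     says nothing about actual run times, and the rounding of q is an
     assumption.

```r
## Leading-order operation counts only; constants are ignored
n <- c(500, 5000, 50000)
q <- pmax(30, ceiling(10 * n^(2/9)))   # default q, rounded up (assumption)
cost.approx <- n * q^2                 # O(n q^2), as in 'ssanova'
cost.exact  <- n^3                     # O(n^3), as in 'ssanova0'
ratio <- cost.exact / cost.approx
print(data.frame(n, q, ratio))
```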

_V_a_l_u_e:

     'ssanova' returns a list object of class '"ssanova"'.

     The method 'summary.ssanova' can be used to obtain summaries of
     the fits.  The method 'predict.ssanova' can be used to evaluate
     the fits at arbitrary points along with standard errors.  The
     method 'project.ssanova' can be used to calculate the
     Kullback-Leibler projection for model selection.  The methods
     'residuals.ssanova' and 'fitted.ssanova' extract the residuals
     and the fitted values, respectively, from the fits.

_N_o_t_e:

     To use GCV and Mallows' CL unmodified, set 'alpha=1'.
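
     A minimal sketch of the three selection criteria on simulated
     data follows.  The data, the seed values, and the choice
     'varht=1' (the true noise variance of the simulation) are
     assumptions made for illustration; the block does nothing when
     'gss' is not installed.

```r
## Compare smoothing parameter selection criteria on simulated data
fits <- list()
if (requireNamespace("gss", quietly = TRUE)) {
  library(gss)
  set.seed(1)
  x <- runif(100); y <- 5 + 3*sin(2*pi*x) + rnorm(100)
  ## Unmodified GCV
  fits$gcv <- ssanova(y~x, method="v", alpha=1, seed=1)
  ## GML (REML); 'alpha' is ignored
  fits$gml <- ssanova(y~x, method="m", seed=1)
  ## Unmodified Mallows' CL with an external variance estimate
  fits$cl <- ssanova(y~x, method="u", alpha=1, varht=1, seed=1)
}
```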

     For simpler models and moderate sample sizes, the exact solution
     of 'ssanova0' can be faster.

     The results may vary from run to run. For consistency, specify
     'id.basis' or set 'seed'.
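
     The following sketch checks that fixing 'seed' reproduces a fit
     and shows the alternative of supplying 'id.basis' directly.  The
     simulated data, the seed value, and the use of the first 40
     observations as knots are arbitrary choices for illustration;
     the block does nothing when 'gss' is not installed.

```r
## Reproducible knot selection via 'seed' or 'id.basis'
same.fit <- NA
if (requireNamespace("gss", quietly = TRUE)) {
  library(gss)
  set.seed(2)
  x <- runif(200); y <- 5 + 3*sin(2*pi*x) + rnorm(200)
  ## Same seed, hence the same randomly selected knots
  fit.a <- ssanova(y~x, seed=5732)
  fit.b <- ssanova(y~x, seed=5732)
  same.fit <- isTRUE(all.equal(fitted(fit.a), fitted(fit.b)))
  ## Alternatively, fix the knots explicitly
  fit.c <- ssanova(y~x, id.basis=1:40)
}
print(same.fit)
```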

     In _gss_ versions earlier than 1.0, 'ssanova' was under the name
     'ssanova1'.

_A_u_t_h_o_r(_s):

     Chong Gu, chong@stat.purdue.edu

_R_e_f_e_r_e_n_c_e_s:

     Gu, C. (2002), _Smoothing Spline ANOVA Models_.  New York:
     Springer-Verlag.

     Kim, Y.-J. and Gu, C. (2004), Smoothing spline Gaussian
     regression: more scalable computation via efficient approximation.
     _Journal of the Royal Statistical Society, Ser. B_, *66*, 337-356.

     Wahba, G. (1990), _Spline Models for Observational Data_.
     Philadelphia: SIAM.

_E_x_a_m_p_l_e_s:

     ## Fit a cubic spline
     x <- runif(100); y <- 5 + 3*sin(2*pi*x) + rnorm(length(x))
     cubic.fit <- ssanova(y~x)
     ## Obtain estimates and standard errors on a grid
     new <- data.frame(x=seq(min(x),max(x),length.out=50))
     est <- predict(cubic.fit,new,se.fit=TRUE)
     ## Plot the fit and the 95% Bayesian confidence intervals
     plot(x,y,col=1); lines(new$x,est$fit,col=2)
     lines(new$x,est$fit+1.96*est$se.fit,col=3)
     lines(new$x,est$fit-1.96*est$se.fit,col=3)
     ## Clean up
     ## Not run: 
     rm(x,y,cubic.fit,new,est)
     dev.off()
     ## End(Not run)

     ## Fit a tensor product cubic spline
     data(nox)
     nox.fit <- ssanova(log10(nox)~comp*equi,data=nox)
     ## Fit a spline with cubic and nominal marginals
     nox$comp <- as.factor(nox$comp)
     nox.fit.n <- ssanova(log10(nox)~comp*equi,data=nox)
     ## Fit a spline with cubic and ordinal marginals
     nox$comp <- as.ordered(nox$comp)
     nox.fit.o <- ssanova(log10(nox)~comp*equi,data=nox)
     ## Clean up
     ## Not run: rm(nox,nox.fit,nox.fit.n,nox.fit.o)

