gssanova {gss} R Documentation

Fitting Smoothing Spline ANOVA Models with Non-Gaussian Responses

Description

gssanova fits smoothing spline ANOVA models with cubic spline, linear spline, or thin-plate spline marginals to responses from selected exponential families. The symbolic model specification via formula follows the same rules as in lm and glm.

Usage

gssanova(formula, family, type="cubic", data=list(), weights, subset,
        offset, na.action=na.omit, partial=NULL, method=NULL,
        varht=1, prec=1e-7, maxiter=30, ext=.05, order=2)

Arguments

formula a symbolic description of the model to be fit.
family a description of the error distribution. Supported are "binomial", "poisson", "Gamma", "inverse.gaussian", and "nbinomial".
type the type of marginals to be used. Supported currently are type="cubic" for cubic spline marginals, type="linear" for linear spline marginals, and type="tp" for thin-plate spline marginals.
data an optional data frame containing the variables in the model.
weights an optional vector of weights to be used in the fitting process.
subset an optional vector specifying a subset of observations to be used in the fitting process.
offset an optional offset term with known parameter 1.
na.action a function which indicates what should happen when the data contain NAs.
partial optional extra fixed effect terms in partial spline models.
method the score used to drive the performance-oriented iteration. Supported are method="v" for GCV, method="m" for type-II ML, and method="u" for Mallows' CL.
varht an external variance estimate needed for method="u". It is ignored when method="v" or method="m" is specified.
prec the precision in the fit required to stop the iteration for multiple smoothing parameter selection. It is ignored when only one smoothing parameter is involved.
maxiter the maximum number of iterations allowed for performance-oriented iteration, and for inner-loop multiple smoothing parameter selection when applicable.
ext for cubic spline and linear spline marginals, this option specifies how far to extend the domain beyond the minimum and the maximum, as a fraction of the range. The default ext=.05 specifies marginal domains of length 110 percent of their respective ranges. Prediction outside of the domain will result in an error. It is ignored if type="tp" is specified.
order for thin-plate spline marginals, this option specifies the order of the marginal penalties. It is ignored if type="cubic" or type="linear" is specified.
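
As a worked illustration of the ext argument, the following sketch computes the marginal domain implied by ext=.05 for a covariate. The helper name extended_domain is hypothetical and not part of gss; it only reproduces the arithmetic described for ext, assuming the extension is applied equally at both ends.

```r
# Hypothetical helper (not part of gss): the marginal domain implied by `ext`,
# assuming the range is extended by ext * range on each side.
extended_domain <- function(x, ext = .05) {
  r <- range(x)
  width <- diff(r)
  c(lower = r[1] - ext * width, upper = r[2] + ext * width)
}

x <- (0:100)/100
dom <- extended_domain(x)
dom                          # lower = -0.05, upper = 1.05
diff(dom) / diff(range(x))   # 1.1, i.e. 110 percent of the range
```

Under this reading, predicting at x = 1.1 would fall outside the domain and trigger the error mentioned above.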

Details

The models are fitted by the penalized likelihood method via performance-oriented iteration, as described in the reference cited below.

Only one link is implemented for each family. It is the logit link for family="binomial", and the log link for "poisson", "Gamma", and "inverse.gaussian". For "nbinomial", the working parameter is the logit of the probability p, whose odds p/(1-p) are proportional to the reciprocal of the mean.
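
To make the links concrete, the following sketch maps a value eta on the link scale back to the scale of the mean for each family. The function inverse_link and its size argument are illustrative assumptions, not part of gss, and the "nbinomial" branch assumes the parametrization of R's NegBinomial, in which the mean is size*(1-p)/p.

```r
# Illustrative helper (not part of gss): map the working parameter `eta`
# back to the mean for each supported family.
inverse_link <- function(eta, family, size = 1) {
  switch(family,
    binomial = plogis(eta),            # logit link: probability
    poisson = ,
    Gamma = ,
    inverse.gaussian = exp(eta),       # log link: mean
    nbinomial = size * exp(-eta),      # eta = logit(p); mean = size*(1-p)/p
    stop("unsupported family: ", family))
}

inverse_link(0, "binomial")                        # 0.5
inverse_link(qlogis(0.25), "nbinomial", size = 3)  # 3 * 0.75 / 0.25 = 9
```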

For family="binomial", "poisson", and "nbinomial", the score driving the performance-oriented iteration defaults to method="u" with varht=1. For family="Gamma" and "inverse.gaussian", the default is method="v".

See ssanova for details and notes concerning smoothing spline ANOVA models.

Value

gssanova returns a list object of class "gssanova" which inherits from the class "ssanova".

The method summary is used to obtain summaries of the fits. The method predict can be used to evaluate the fits at arbitrary points, along with the standard errors to be used in Bayesian confidence intervals, both on the scale of the link. The methods residuals and fitted.values extract the corresponding quantities from the fits.
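
Because the fits and standard errors are reported on the link scale, intervals are formed there and then mapped to the response scale. The following sketch does this for the logit link; fit and se are made-up numbers for illustration, not output from gssanova.

```r
# Made-up link-scale estimates and standard errors (illustration only).
fit <- c(-1.2, 0.3, 1.5)
se  <- c(0.40, 0.25, 0.50)

z <- qnorm(0.975)   # approximately 1.96

# Build the interval on the logit scale, then back-transform with plogis().
ci <- data.frame(
  estimate = plogis(fit),
  lower    = plogis(fit - z * se),
  upper    = plogis(fit + z * se)
)
ci
```

Transforming the endpoints, rather than adding and subtracting on the probability scale, keeps the limits inside (0, 1).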

Note

For family="binomial", the responses can be specified either as two columns of counts or as a column of sample proportions plus a column of weights, the same as in glm.
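
The two response specifications can be checked against each other in glm itself, whose convention this note refers to. The following sketch uses only base R with simulated data (the seed, the number of trials, and the true curve are arbitrary choices) and compares the coefficient estimates from the two forms.

```r
set.seed(1)
n <- 3                                   # trials per observation (arbitrary)
x <- (0:100)/100
y <- rbinom(length(x), n, plogis(2 * x - 1))

# Form 1: two columns of counts (successes, failures).
fit.counts <- glm(cbind(y, n - y) ~ x, family = binomial)

# Form 2: sample proportions plus the number of trials as weights.
fit.props <- glm(y/n ~ x, family = binomial, weights = rep(n, length(x)))

# The two fits should agree up to numerical tolerance.
all.equal(coef(fit.counts), coef(fit.props))
```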

For family="nbinomial", the responses may be specified as two columns with the second being the known sizes, or simply a single column with the common unknown size to be estimated by ML.

Author(s)

Chong Gu, chong@stat.purdue.edu

References

Gu, C. (1992), "Cross-validating non-Gaussian data," Journal of Computational and Graphical Statistics, 1, 169-179.

See Also

predict.ssanova for predictions and summary.gssanova for summaries.

Examples

## Fit a cubic smoothing spline logistic regression model
test <- function(x)
        {.3*(1e6*(x^11*(1-x)^6)+1e4*(x^3*(1-x)^10))-2}
x <- (0:100)/100
p <- 1-1/(1+exp(test(x)))
y <- rbinom(length(x),3,p)
logit.fit <- gssanova(cbind(y,3-y)~x,family="binomial")

## The same fit
logit.fit1 <- gssanova(y/3~x,"binomial",weights=rep(3,101))

## Obtain estimates and standard errors on a grid
est <- predict(logit.fit,data.frame(x=x),se=TRUE)

## Plot the fit and the Bayesian confidence intervals
plot(x,y/3,ylab="p")
lines(x,p,col=1)
lines(x,1-1/(1+exp(est$fit)),col=2)
lines(x,1-1/(1+exp(est$fit+1.96*est$se)),col=3)
lines(x,1-1/(1+exp(est$fit-1.96*est$se)),col=3)
