Recursive Partitioning and Regression Trees
Usage
rpart(formula, data, weights, subset, na.action=na.rpart, method,
model=F, x=F, y=T, parms, control=rpart.control(...), ...)
Arguments
formula
|
a formula, as in the lm function.
|
data
|
an optional data frame in which to interpret the variables named in the
formula.
|
weights
|
optional weights (currently ignored).
|
subset
|
optional expression saying that only a subset of the rows of the data
should be used in the fit.
|
na.action
|
The default action deletes all observations for which y is missing,
but keeps those in which one or more predictors are missing.
|
method
|
one of "anova", "poisson", "class" or "exp".
If method is missing then the routine tries to make an intelligent guess.
If y is a survival object, then method="exp" is assumed,
if y has 2 columns then method="poisson" is assumed,
if y is a factor then method="class" is assumed, otherwise method="anova"
is assumed. It is wisest to specify the method directly, especially as
more criteria are added to the function.
|
model
|
keep a copy of the model frame in the result.
If the input value for model is a model frame (likely from an earlier
call to the rpart function), then this frame is used rather than constructing
new data.
|
x
|
keep a copy of the x matrix in the result.
|
y
|
keep a copy of the dependent variable in the result.
|
parms
|
optional parameters for the splitting function.
Anova splitting has no parameters.
Poisson splitting has a single parameter, the coefficient of variation of
the prior distribution on the rates. The default value is 1.
Exponential splitting has the same parameter as Poisson.
For classification splitting, the list can contain any of:
the vector of prior probabilities (component prior), the loss matrix
(component loss) or the splitting index (component split). The
priors must be positive and sum to 1. The loss matrix must have zeros
on the diagonal and positive off-diagonal elements. The splitting
index can be gini or information. The default priors are
proportional to the data counts, the losses default to 1,
and the split defaults to gini.
|
control
|
options that control details of the rpart algorithm.
|
...
|
arguments to rpart.control may also be specified in the call to rpart.
|
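The method and parms arguments described above can be combined as in the following sketch. It assumes the rpart package is installed and uses the kyphosis data from the Examples section; the loss values and the object names fit_class, fit_loss and loss are illustrative choices, not recommendations.

```r
library(rpart)   # assumption: the rpart package is installed
data(kyphosis)

# Name the method explicitly instead of relying on the guess made from y.
fit_class <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                   method = "class")

# Illustrative loss matrix: zeros on the diagonal, positive elsewhere.
loss <- matrix(c(0, 1,
                 2, 0), nrow = 2, byrow = TRUE)

# Classification parms: priors that sum to 1, the loss matrix, and the
# information splitting index in place of the default gini.
fit_loss <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                  method = "class",
                  parms = list(prior = c(0.65, 0.35),
                               loss = loss,
                               split = "information"))
```

Stating method="class" here only makes explicit what would otherwise be guessed from the factor response.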
Description
Fit an rpart model.
Details
This differs from the tree function mainly in its handling of surrogate
variables.
Value
an object of class rpart, a superset of class tree.
References
Breiman, Friedman, Olshen, and Stone. (1984)
Classification and Regression Trees. Wadsworth.
See Also
rpart.control, rpart.object, tree, summary.rpart, print.rpart
Examples
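library(rpart)
data(kyphosis)
# default fit; method is guessed from the factor response
fit <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
# user-supplied priors and the information splitting index
fit2 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis,
              parms=list(prior=c(.65,.35), split='information'))
# larger complexity parameter than the default
fit3 <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis,
              control=rpart.control(cp=.05))
# plot the first two trees side by side, labelled with node counts
par(mfrow=c(1,2))
plot(fit)
text(fit, use.n=TRUE)
plot(fit2)
text(fit2, use.n=TRUE)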
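
As noted under the ... argument, options for rpart.control may also be given directly in the call to rpart. The sketch below, which assumes the rpart package is installed, fits the same model both ways; the names fit_a and fit_b are illustrative.

```r
library(rpart)   # assumption: the rpart package is installed
data(kyphosis)

# Control option supplied through rpart.control() ...
fit_a <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
               control = rpart.control(cp = 0.05))

# ... and the same option passed directly through `...`.
fit_b <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
               cp = 0.05)

# Both fits record the complexity parameter that was used.
print(fit_a$control$cp)
print(fit_b$control$cp)
```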