Decision tree in R error: fit is not a tree, just a root

Problem description

Good afternoon! I have a problem with a decision tree.

library(rpart)

f11 <- as.factor(Z24train$f1)
fit_f1 <- rpart(f11 ~ TSU + TSL + TW + TP, data = Z24train, method = "class")
plot(fit_f1, uniform = TRUE, main = "Classification Tree for Kyphosis")

But this error appears:

Error in plot.rpart(fit_f1, uniform = TRUE, main = "Classification Tree for Kyphosis") : 
  fit is not a tree, just a root

What is the problem? Thanks for the help :)

Solution

This is probably because rpart could not find any worthwhile split of the root node in your data set under its default control parameters:

rpart.control(minsplit = 20, minbucket = round(minsplit/3), cp = 0.01, 
maxcompete = 4, maxsurrogate = 5, usesurrogate = 2, xval = 10,
surrogatestyle = 0, maxdepth = 30, ...)
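
To see how these defaults can lead to a root-only fit, here is a minimal sketch on made-up data. The data frame `toy` and its columns `x` and `y` are hypothetical and stand in for the asker's Z24train, which is not available. With only 15 rows, the root node holds fewer observations than the default minsplit = 20, so no split is attempted. Counting the rows of `fit$frame` (one row per node) is one way to check for this before calling plot().

```r
library(rpart)

# Hypothetical toy data: 15 rows, with a class that x separates perfectly.
toy <- data.frame(x = 1:15)
toy$y <- factor(ifelse(toy$x > 8, "yes", "no"))

# Default controls: minsplit = 20 exceeds the 15 rows in the root node,
# so no split is attempted and the fit consists of the root alone.
fit_default <- rpart(y ~ x, data = toy, method = "class")

# fit$frame has one row per node; a single row means "just a root".
n_nodes_default <- nrow(fit_default$frame)

# Lowering minsplit and minbucket lets rpart consider splitting the root.
fit_relaxed <- rpart(y ~ x, data = toy, method = "class",
                     control = rpart.control(minsplit = 2, minbucket = 1))
n_nodes_relaxed <- nrow(fit_relaxed$frame)

print(c(default = n_nodes_default, relaxed = n_nodes_relaxed))
```

A small sample is only one possible cause. Predictors that carry no signal for the response, or a heavily imbalanced class, can also leave every candidate split below the cp threshold.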

If you want to force rpart to produce a tree anyway, you can relax the control parameters. Be aware that the result will be an overfitted tree.

tree <- rpart(f11 ~ TSU + TSL + TW + TP, data = Z24train, method = "class",
              control = rpart.control(minsplit = 1, minbucket = 1, cp = 0))
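
Because a tree grown this way overfits, a common follow-up is to prune it back using the cross-validated error that rpart records in `cptable`. The sketch below is not part of the original answer. It uses the `kyphosis` data set that ships with rpart in place of Z24train, and the choice of the row with the lowest `xerror` is just one possible pruning rule.

```r
library(rpart)

set.seed(1)  # cross-validation folds are assigned at random

# Grow a deliberately large tree on the kyphosis data shipped with rpart.
full <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
              method = "class",
              control = rpart.control(minsplit = 2, minbucket = 1, cp = 0))

# cptable lists nested subtrees with their cross-validated error (xerror).
best_row <- which.min(full$cptable[, "xerror"])
best_cp  <- full$cptable[best_row, "CP"]

# Prune back to the subtree with the lowest cross-validated error.
pruned <- prune(full, cp = best_cp)

# Compare node counts before and after pruning.
print(c(full = nrow(full$frame), pruned = nrow(pruned$frame)))
```

If no split survives cross-validation, the pruned tree can collapse back to a single root node. That suggests the predictors do not support a useful tree for this data.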

Parameter descriptions, taken from the R documentation (https://stat.ethz.ch/R-manual/R-devel/library/rpart/html/rpart.control.html):

minsplit
the minimum number of observations that must exist in a node in order for a split to be attempted.

minbucket
the minimum number of observations in any terminal node. If only one of minbucket or minsplit is specified, the code either sets minsplit to minbucket*3 or minbucket to minsplit/3, as appropriate.

cp
complexity parameter. Any split that does not decrease the overall lack of fit by a factor of cp is not attempted. For instance, with anova splitting, this means that the overall R-squared must increase by cp at each step. The main role of this parameter is to save computing time by pruning off splits that are obviously not worthwhile. Essentially, the user informs the program that any split which does not improve the fit by cp will likely be pruned off by cross-validation, and that hence the program need not pursue it.
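
The parameter descriptions above can be checked directly. This is a minimal sketch, again using the `kyphosis` data set bundled with rpart rather than the asker's data. The values 30, 5 and 0.5 are arbitrary choices for illustration.

```r
library(rpart)

# Specifying only one of minsplit / minbucket derives the other.
ctrl_a <- rpart.control(minsplit = 30)   # minbucket becomes round(30 / 3)
ctrl_b <- rpart.control(minbucket = 5)   # minsplit becomes 5 * 3
print(c(minsplit = ctrl_a$minsplit, minbucket = ctrl_a$minbucket))
print(c(minsplit = ctrl_b$minsplit, minbucket = ctrl_b$minbucket))

# With the default cp = 0.01, the kyphosis data yields a tree with splits.
fit_default <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                     method = "class")

# With a very large cp, no split improves the fit enough to be kept,
# so the result is a root-only fit that plot() refuses to draw.
fit_high_cp <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                     method = "class",
                     control = rpart.control(cp = 0.5))

print(c(default = nrow(fit_default$frame), high_cp = nrow(fit_high_cp$frame)))
```

Lowering cp step by step, rather than jumping straight to cp = 0, is a gentler way to find out whether any split is worthwhile for a given data set.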
