caret: using random forest and include cross-validation


Problem description

I used the caret package to train a random forest, including repeated cross-validation. I'd like to know whether the out-of-bag (OOB) estimate, as in the original RF by Breiman, is still used or whether it is replaced by the cross-validation. If it is replaced, do I have the same advantages as described in Breiman 2001, like increased accuracy by reducing the correlation between input data? As OOB is drawn with replacement and CV is drawn without replacement, are both procedures comparable? What is the OOB estimate of error rate (based on CV)?

How are the trees grown?

As this is my first thread, please let me know if you need more details. Many thanks in advance.

Recommended answer

There are a lot of basic questions here, and you would be better served by reading a book on machine learning or predictive modeling. That's probably why you haven't gotten much of a response.

For caret you should also consult the package website where some of these questions are answered.

Here are some notes:


  • CV and OOB estimation for RF are somewhat different. This post might help explain how. For this application, the OOB rate from random forest is computed while the model is being built, whereas CV uses holdout samples that are predicted after the random forest model is computed.
  • The original random forest model (used here) uses unpruned CART trees. Again, this is in many textbooks and papers.
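
To make the sampling difference between the two estimates concrete, here is a minimal sketch in plain Python. It is not caret or randomForest code, and the helper names `bootstrap_split` and `kfold_split` are made up for illustration. It only shows how the held-out rows arise: each tree's OOB rows are whatever its bootstrap sample (drawn with replacement) happened to miss, while CV cuts the rows into disjoint folds (no replacement) and predicts each fold with a model fit on the others.

```python
import random


def bootstrap_split(n, rng):
    """Draw n row indices with replacement; rows never drawn are out-of-bag."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    out_of_bag = sorted(set(range(n)) - set(in_bag))
    return in_bag, out_of_bag


def kfold_split(n, k, rng):
    """Shuffle the row indices once and cut them into k disjoint folds."""
    indices = list(range(n))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 1000                # hypothetical number of training rows

# One tree's view of the data: a bootstrap sample plus its OOB rows.
in_bag, oob = bootstrap_split(n, rng)
oob_fraction = len(oob) / n

# One CV repeat: every row lands in exactly one holdout fold.
folds = kfold_split(n, 10, rng)

print(f"OOB fraction for one tree: {oob_fraction:.3f}")
print(f"fold sizes: {[len(f) for f in folds]}")
```

For a large number of rows the OOB fraction per tree tends toward 1 - 1/e, about 0.368, so every tree has its own held-out set of a bit over a third of the data. A 10-fold split instead holds out exactly one tenth of the rows at a time for the whole forest.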

Max
