Understanding the max_features parameter in RandomForestRegressor

Question

While constructing each tree in a random forest from bootstrapped samples, for each terminal node we select m variables at random from the p available variables to find the best split (p is the total number of features in the data). My questions (for RandomForestRegressor) are:

1) What does max_features correspond to (m, p, or something else)?

2) Are m variables selected at random from max_features variables (and if so, what is the value of m)?

3) If max_features corresponds to m, then why would I want to set it equal to p for regression (the default)? Where is the randomness with this setting (i.e., how is it different from bagging)?

Thanks.

Recommended answer

Straight from the documentation:

[max_features] is the size of the random subsets of features to consider when splitting a node.

So max_features is what you call m. When max_features="auto", m = p and no feature subset selection is performed in the trees, so the "random forest" is actually a bagged ensemble of ordinary regression trees. The docs go on to say:

Empirical good default values are max_features=n_features for regression problems, and max_features=sqrt(n_features) for classification tasks
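To make the mapping from max_features to m concrete, here is a minimal sketch that fits small forests and reads back the number of features each tree considers per split. The synthetic dataset, the helper name features_per_split, and the choice of p = 9 are assumptions made for illustration only. The sketch uses max_features=1.0 rather than "auto", since recent scikit-learn releases, as far as I know, no longer accept "auto" and express "all features" as 1.0 or None.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data with p = 9 features (an arbitrary choice for illustration).
X, y = make_regression(n_samples=200, n_features=9, random_state=0)
p = X.shape[1]


def features_per_split(max_features):
    """Hypothetical helper: fit a small forest and return m for its first tree."""
    forest = RandomForestRegressor(
        n_estimators=5, max_features=max_features, random_state=0
    )
    forest.fit(X, y)
    # Each fitted tree stores the resolved integer value as `max_features_`.
    return forest.estimators_[0].max_features_


m_all = features_per_split(1.0)      # float is a fraction of p: all 9 features
m_none = features_per_split(None)    # None also means all p features
m_sqrt = features_per_split("sqrt")  # floor of sqrt(9)
m_int = features_per_split(4)        # an integer is taken as m directly

print(p, m_all, m_none, m_sqrt, m_int)
```

With m equal to p, every split searches all features, so the only randomness left comes from the bootstrap samples, which is the bagging case described in the answer.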

By setting max_features differently, you'll get a "true" random forest.
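As a rough sketch of what "setting max_features differently" could look like in practice, the snippet below cross-validates a few candidate values on synthetic data. The dataset, the candidate values, and the labels are all assumptions chosen for illustration. Which setting scores best depends on the data, so treat the printed numbers as something to check on your own problem rather than as a general result.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic regression problem (illustrative only).
X, y = make_regression(
    n_samples=300, n_features=12, n_informative=6, noise=10.0, random_state=0
)

# Candidate settings: 1.0 considers every feature at each split (plain bagging
# of trees); the other two draw a random feature subset at each split.
candidates = {
    "all features (bagging)": 1.0,
    "about one third": 0.33,
    "sqrt": "sqrt",
}

scores = {}
for label, max_features in candidates.items():
    forest = RandomForestRegressor(
        n_estimators=50, max_features=max_features, random_state=0
    )
    # Mean R^2 over 3 cross-validation folds.
    scores[label] = cross_val_score(forest, X, y, cv=3, scoring="r2").mean()

for label, score in scores.items():
    print(f"{label}: mean R^2 = {score:.3f}")
```

The same comparison could be run with GridSearchCV over max_features if you want the search and refit handled in one step.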
