Permutation importance in H2O random forest

Question

The CRAN implementation of random forests offers both variable importance measures: the Gini importance as well as the widely used permutation importance defined as

For classification, it is the increase in percent of times a case is OOB and misclassified when the variable is permuted. For regression, it is the average increase in squared OOB residuals when the variable is permuted

By default h2o.varimp() computes only the former. Is there really no option in h2o to get the alternative measure out of a random forest model?

Thanks! ML

Recommended answer

H2O does not calculate permutation importance. Please see the documentation for the explanation of how variable importance is calculated.

For your convenience I'll paste it as well below:

How is variable importance calculated for DRF?

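Variable importance is determined by calculating the relative influence of each variable: whether that variable was selected to split on during the tree building process, and how much the squared error (over all trees) improved as a result.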
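To make that description concrete, here is a minimal sketch of how such a relative influence could be aggregated. It is not H2O's actual code: the input format (each tree as a list of `(variable, sse_reduction)` pairs) and the function name `relative_influence` are assumptions made up for illustration. The three output columns mirror the relative, scaled, and percentage values that a variable importance table typically reports.

```python
from collections import defaultdict


def relative_influence(trees):
    """Sum squared-error reductions per split variable over all trees.

    `trees` is a hypothetical structure: one list per tree, holding a
    (variable, sse_reduction) pair for every split in that tree.
    """
    totals = defaultdict(float)
    for splits in trees:
        for variable, sse_reduction in splits:
            totals[variable] += sse_reduction

    largest = max(totals.values())
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: -item[1])
    return {
        variable: {
            "relative": total,                   # raw summed improvement
            "scaled": total / largest,           # most important variable = 1.0
            "percentage": total / grand_total,   # shares sum to 1.0
        }
        for variable, total in ranked
    }


# Toy forest with two trees; the numbers are invented.
toy_forest = [
    [("x1", 8.0), ("x2", 2.0)],
    [("x1", 4.0), ("x3", 1.0), ("x2", 1.0)],
]
importance = relative_influence(toy_forest)
for variable, scores in importance.items():
    print(variable, scores)
```

Note that a variable that is never chosen for a split does not appear at all, and that nothing here involves permuting data: the measure comes entirely from the training-time splits.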

A feature request has previously been made for this issue; you can follow it here (though note it is currently open).
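In the meantime, the permutation importance quoted in the question can be approximated by hand for any model that exposes a prediction function. The sketch below is a model-agnostic variant and rests on several assumptions: it uses a held-out data set rather than the OOB cases that the CRAN randomForest package uses, it covers the regression case only, and `predict` is a hypothetical stand-in for whatever call returns a prediction for one row (it is not an H2O function).

```python
import random


def mean_squared_error(predict, rows, targets):
    """Average squared residual of `predict` over the given rows."""
    return sum((predict(r) - y) ** 2 for r, y in zip(rows, targets)) / len(rows)


def permutation_importance(predict, rows, targets, n_repeats=5, seed=0):
    """Mean increase in MSE when one feature column is shuffled.

    `rows` is a list of feature lists; `predict` maps one row to a number.
    """
    rng = random.Random(seed)
    baseline = mean_squared_error(predict, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving all other columns untouched.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mean_squared_error(predict, permuted, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy check: the target depends on feature 0 only, so shuffling
# feature 1 should not change the error at all.
data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
targets = [3.0 * r[0] for r in rows]


def toy_predict(row):
    # Stand-in for a trained model's prediction on a single row.
    return 3.0 * row[0]


scores = permutation_importance(toy_predict, rows, targets)
print(scores)
```

With a real model, `toy_predict` would be replaced by a wrapper around the model's own scoring call. Scoring row by row is slow, so in practice it is better to shuffle a column in the full data set and score it in one batch.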
