How to use the whole training set to estimate class probabilities in sklearn RandomForest

Views: 125 · Published: 2020/5/4 10:07:23 · Tags: machine-learning, scikit-learn, random-forest

Question

I want to use scikit-learn's RandomForestClassifier to estimate the probability that a given example belongs to each of a set of classes, after training the model first, of course.

I know I can get the class probabilities with the predict_proba method, which computes them as

[...] the mean predicted class probabilities of the trees in the forest.
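The averaging described in that quote can be checked directly. The following is a minimal sketch on synthetic data (the dataset, sizes, and seeds are arbitrary assumptions, not part of the question) that averages the per-tree estimates by hand and compares the result with the forest's own predict_proba.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data; the sizes and seeds are arbitrary choices for illustration.
X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           n_classes=3, random_state=0)

forest = RandomForestClassifier(n_estimators=25, random_state=0).fit(X, y)

# Average the per-tree probability estimates by hand.
per_tree = np.stack([tree.predict_proba(X) for tree in forest.estimators_])
manual_mean = per_tree.mean(axis=0)

# Compare with what the forest itself reports.
forest_proba = forest.predict_proba(X)
print(np.allclose(manual_mean, forest_proba))
```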

As mentioned in this question:

The probabilities returned by a single tree are the normalized class histograms of the leaf a sample lands in.
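To make the leaf-histogram statement concrete, here is a small sketch that rebuilds a single tree's probabilities from its stored node values. The data is synthetic, and max_depth=3 is an assumption chosen only so that the leaves stay impure. Depending on the scikit-learn version, tree_.value holds class counts or class fractions, so the sketch normalizes it either way.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           n_classes=3, random_state=0)

# A shallow tree, so that leaves contain a mix of classes.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

leaf_ids = tree.apply(X)                 # index of the leaf each sample lands in
node_values = tree.tree_.value[:, 0, :]  # per-node class counts or fractions
node_hist = node_values / node_values.sum(axis=1, keepdims=True)

# Look up the normalized histogram of each sample's leaf.
manual_proba = node_hist[leaf_ids]
print(np.allclose(manual_proba, tree.predict_proba(X)))
```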

Now, I've been reading some papers on probability estimation and realized that there isn't a trivial solution. According to Estimating Class Probabilities in Random Forests (Boström):

using the same examples to both grow the trees and estimate the probabilities, [...] by necessity will lead to pure (and therefore small) estimation sets

And this is bad, because a pure leaf makes its tree report a probability of exactly 0 or 1 for each class. The solution appears to be to use all the examples in the training set, instead of only those in the bootstrap sample used to grow the tree.
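The "pure and small" effect can be observed in a fitted forest. The sketch below, on synthetic data with default forest settings (both assumptions), measures for each tree the fraction of leaves that contain a single class and the average training weight per leaf.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=8, n_informative=4,
                           n_classes=3, random_state=0)

# Default settings: trees are grown until the leaves cannot be split further.
forest = RandomForestClassifier(n_estimators=25, random_state=0).fit(X, y)

pure_fractions = []
leaf_sizes = []
for est in forest.estimators_:
    t = est.tree_
    is_leaf = t.children_left == -1  # leaves have no children
    values = t.value[is_leaf, 0, :]
    fractions = values / values.sum(axis=1, keepdims=True)
    # A leaf is pure when one class holds all of its weight.
    pure_fractions.append(np.isclose(fractions.max(axis=1), 1.0).mean())
    # Total training weight reaching each leaf; with bootstrap sampling
    # this reflects the number of in-bag draws.
    leaf_sizes.append(t.weighted_n_node_samples[is_leaf].mean())

print("mean fraction of pure leaves:", np.mean(pure_fractions))
print("mean training weight per leaf:", np.mean(leaf_sizes))
```

With pure leaves, each tree votes 0 or 1 for every class, so the forest's probability is effectively a vote share across trees.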

Scikit-learn uses only each tree's bootstrap sample to calculate that tree's class probability estimates, right? Does anybody have pointers on how to make the class probabilities come from the whole training set of the RandomForest instead?

I assume this would need a special Tree subclass that doesn't assign class probabilities to the leaves of the trees, plus some procedure that assigns them from the RandomForest classifier using the whole training set.
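One possible way to approach this without subclassing is sketched below. It relies on the forest's apply method, which returns the leaf index each sample reaches in every tree, and recounts the classes per leaf over the whole training set. The helper names full_train_leaf_histograms and predict_proba_full_train are hypothetical, this is not a built-in scikit-learn option, and it is not claimed to reproduce the exact procedure in Boström's paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def full_train_leaf_histograms(forest, X_train, y_train):
    """Per tree, count the classes of ALL training samples in each leaf."""
    class_index = np.searchsorted(forest.classes_, y_train)
    leaves = forest.apply(X_train)  # shape (n_samples, n_trees)
    histograms = []
    for t, est in enumerate(forest.estimators_):
        counts = np.zeros((est.tree_.node_count, len(forest.classes_)))
        # Unbuffered add, so repeated (leaf, class) pairs are all counted.
        np.add.at(counts, (leaves[:, t], class_index), 1.0)
        histograms.append(counts)
    return histograms


def predict_proba_full_train(forest, histograms, X):
    """Average the normalized full-training-set leaf histograms over trees."""
    leaves = forest.apply(X)
    proba = np.zeros((X.shape[0], len(forest.classes_)))
    for t, counts in enumerate(histograms):
        leaf_counts = counts[leaves[:, t]]
        # Every leaf holds at least one in-bag training sample, so totals > 0.
        totals = leaf_counts.sum(axis=1, keepdims=True)
        proba += leaf_counts / totals
    return proba / len(histograms)


# Synthetic data; sizes and seeds are arbitrary choices for illustration.
X, y = make_classification(n_samples=400, n_features=8, n_informative=4,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=25, random_state=0)
forest.fit(X_train, y_train)

histograms = full_train_leaf_histograms(forest, X_train, y_train)
proba = predict_proba_full_train(forest, histograms, X_test)
print(proba[:3])
```

The tree structure is left untouched; only the numbers read out at the leaves change. Because the out-of-bag samples of each tree are also counted, leaves that were pure on the bootstrap sample can become mixed, which moves the per-tree estimates away from 0 and 1.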
