How to interpret a decision tree from scikit-learn


Question

I have two questions about interpreting the output of a decision tree from scikit-learn. For example, this is one of my decision trees:

My question is: how do I use this tree?

First question: if a sample satisfies the condition, it goes to the LEFT branch (if one exists); otherwise it goes RIGHT. In my case, a sample with X[7] > 63521.3984 will go to the green box. Is that correct?

Second question: when a sample reaches a leaf node, how do I know which category it belongs to? In this example, I have three categories to classify. In the red box, 91, 212, and 113 samples respectively satisfy the condition. But how do I decide the category? I know there is a function clf.predict(sample) that returns the category. Can I work it out from the graph? Many thanks.

Recommended answer

The value line in each box tells you how many samples at that node fall into each category, in order. That is why, in each box, the numbers in value add up to the number shown in samples. For instance, in your red box, 91+212+113=416. So if you reach this node, it means 91 training data points there were in category 1, 212 in category 2, and 113 in category 3.
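The per-leaf counts described above can be reproduced from a fitted tree. The following is a minimal sketch, not the asker's model: it assumes the iris dataset and a depth-2 tree as stand-ins, and it recounts the training labels per leaf with clf.apply rather than reading tree_.value directly, since depending on the scikit-learn version tree_.value may hold class fractions instead of raw counts.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Stand-in data and model (assumptions; the original question uses its own data).
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# apply() returns, for every sample, the index of the leaf it ends up in.
leaf_of_sample = clf.apply(X)

# Count the training labels per class in each leaf, in class order.
leaf_counts = {}
for leaf in np.unique(leaf_of_sample):
    counts = np.bincount(y[leaf_of_sample == leaf], minlength=len(clf.classes_))
    leaf_counts[int(leaf)] = counts
    # The per-class counts add up to the "samples" figure of that node.
    print(leaf, counts, counts.sum(), clf.tree_.n_node_samples[leaf])
```

Each printed row plays the role of one leaf box in the plot: the class counts in order, their sum, and the node's samples figure.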

If you were to predict the outcome for a new data point that reached that leaf in the decision tree, you would predict category 2, because that is the most common category among the samples at that node.
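To check this reading of the graph against clf.predict, the tree can be walked by hand. This is a hedged sketch on stand-in data (iris, depth 3), and predict_by_hand is a hypothetical helper name, not part of scikit-learn. It relies on the fitted tree's tree_ arrays (feature, threshold, children_left, children_right, value). Note that scikit-learn writes each test as X[i] <= threshold and sends samples for which the test is true to the left child, so a sample with a feature value above the threshold goes right.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Stand-in data and model (assumptions; not the tree from the question).
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)


def predict_by_hand(clf, x):
    """Hypothetical helper: follow the splits for one sample, then take the majority class."""
    tree = clf.tree_
    # scikit-learn casts inputs to float32 internally; do the same here.
    x = np.asarray(x, dtype=np.float32)
    node = 0  # the root node has index 0
    # A leaf is marked by a child index of -1.
    while tree.children_left[node] != -1:
        if x[tree.feature[node]] <= tree.threshold[node]:
            node = tree.children_left[node]   # test is true: go left
        else:
            node = tree.children_right[node]  # test is false: go right
    # The largest entry of the leaf's value row marks the predicted class.
    class_index = np.argmax(tree.value[node][0])
    return clf.classes_[class_index]


manual = np.array([predict_by_hand(clf, row) for row in X])
print((manual == clf.predict(X)).all())
```

Taking the argmax works whether the value row holds raw counts or class fractions, because both put the largest entry on the same class.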
