Where does scikit-learn hold the decision labels of each leaf node in its tree structure?
Problem description
I have trained a random forest model using scikit-learn, and now I want to save its tree structures in a text file so I can use them elsewhere. According to this link, a tree object consists of a number of parallel arrays, each holding some information about the nodes of the tree (e.g. left child, right child, which feature it examines, ...). However, there seems to be no information about the class label corresponding to each leaf node! It's not even mentioned in the examples provided in the link above.
Does anyone know where the class labels are stored in the scikit-learn decision tree structure?
Recommended answer
Take a look at sklearn.tree.DecisionTreeClassifier.tree_.value:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
clf = DecisionTreeClassifier(random_state=0)
iris = load_iris()
clf.fit(iris.data, iris.target)
print(clf.classes_)
[0 1 2]
print(clf.tree_.value)
[[[ 50. 50. 50.]]
[[ 50. 0. 0.]]
[[ 0. 50. 50.]]
[[ 0. 49. 5.]]
[[ 0. 47. 1.]]
[[ 0. 47. 0.]]
[[ 0. 0. 1.]]
[[ 0. 2. 4.]]
[[ 0. 0. 3.]]
[[ 0. 2. 1.]]
[[ 0. 2. 0.]]
[[ 0. 0. 1.]]
[[ 0. 1. 45.]]
[[ 0. 1. 2.]]
[[ 0. 0. 2.]]
[[ 0. 1. 0.]]
[[ 0. 0. 43.]]]
Each row in clf.tree_.value "contains the constant prediction value of each node" (see help(clf.tree_)), and the entries of a row correspond index-to-index to clf.classes_. For a leaf node, the predicted label is therefore the entry of clf.classes_ at the position of the largest value in that leaf's row.
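The index-to-index mapping described above can be turned into explicit leaf labels. The following is a minimal sketch, not the library's own export routine: it assumes a single-output classifier, and the names is_leaf, node_labels, and leaf_labels are illustrative only. It relies only on the position of the largest entry in each row, so it should behave the same whether a given scikit-learn version stores counts or proportions in tree_.value.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)
tree = clf.tree_

# A node is a leaf when it has no children; scikit-learn marks that with -1.
is_leaf = tree.children_left == -1

# tree.value has shape (node_count, n_outputs, n_classes).
# Assumption: single-output problem, so output 0 is the only one.
class_index = np.argmax(tree.value[:, 0, :], axis=1)

# Map the winning column of every node back to the original class label.
node_labels = clf.classes_[class_index]

# Keep only the leaves: {node id: predicted class label}.
leaf_labels = {int(i): int(node_labels[i]) for i in np.flatnonzero(is_leaf)}
print(leaf_labels)
```

As a sanity check, clf.apply(X) returns the leaf id each sample lands in, so node_labels[clf.apply(X)] should agree with clf.predict(X).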
See this answer for (barely) more details.
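Since the question is about saving a random forest as text, here is a hedged sketch of how the same idea might extend to every tree in a forest. The helper tree_to_dict and the JSON layout are hypothetical choices made for illustration, not a scikit-learn format. Labels are looked up through forest.classes_ because the individual trees inside a forest are fitted on encoded class indices.

```python
import json
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
forest = RandomForestClassifier(n_estimators=3, random_state=0)
forest.fit(iris.data, iris.target)

def tree_to_dict(tree, classes):
    """Hypothetical helper: copy one tree's parallel arrays into plain lists."""
    # Assumption: single-output problem, so output 0 is the only one.
    majority = np.argmax(tree.value[:, 0, :], axis=1)
    return {
        "children_left": tree.children_left.tolist(),    # -1 marks a leaf
        "children_right": tree.children_right.tolist(),  # -1 marks a leaf
        "feature": tree.feature.tolist(),
        "threshold": tree.threshold.tolist(),
        "label": classes[majority].tolist(),              # majority class per node
    }

dumped = [tree_to_dict(est.tree_, forest.classes_) for est in forest.estimators_]
text = json.dumps(dumped)  # this string can be written to a text file
```

Note that scikit-learn's forest averages the per-tree class probabilities instead of voting on per-tree labels, so if the exported model must reproduce forest.predict exactly, it is safer to store the full tree.value rows rather than only the majority label.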