How do I find which attributes my tree splits on, when using scikit-learn?

Question

I have been exploring scikit-learn, making decision trees with both entropy and gini splitting criteria, and exploring the differences.

My question is: how can I "open the hood" and find out exactly which attributes the trees are splitting on at each level, along with their associated information values, so I can see where the two criteria make different choices?

So far, I have explored the 9 methods outlined in the documentation. They don't appear to allow access to this information. But surely this information is accessible? I'm envisioning a list or dict that has entries for node and gain.

Thanks for your help and my apologies if I've missed something completely obvious.

Solution

Directly from the documentation ( http://scikit-learn.org/0.12/modules/tree.html ):

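from io import StringIO
from sklearn import tree
out = StringIO()
tree.export_graphviz(clf, out_file=out)  # writes the DOT source into out; do not reassign out
print(out.getvalue())  # each node label shows the split test and the node impurity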

The standalone StringIO module no longer exists in Python 3; import StringIO from the io module instead.
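For a complete run, a minimal sketch along these lines fits one tree per criterion and keeps the DOT text of each so the two can be compared side by side. The iris dataset, `max_depth=2`, `random_state=0` and the `dot_sources` dict are assumptions chosen for illustration; they are not part of the documentation snippet.

```python
from io import StringIO

from sklearn import tree
from sklearn.datasets import load_iris

iris = load_iris()  # assumption: any labelled dataset would do

dot_sources = {}  # hypothetical container: criterion name -> DOT text
for criterion in ("gini", "entropy"):
    clf = tree.DecisionTreeClassifier(
        criterion=criterion, max_depth=2, random_state=0
    )
    clf.fit(iris.data, iris.target)

    out = StringIO()
    # Writes into the buffer; feature_names makes the split labels readable.
    tree.export_graphviz(clf, out_file=out, feature_names=iris.feature_names)
    dot_sources[criterion] = out.getvalue()

print(dot_sources["entropy"])
```

Each node label in the DOT text carries the split test and the impurity under the chosen criterion, so comparing the two strings already shows where the trees diverge.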

There is also the tree_ attribute on the fitted decision tree object, which gives direct access to the whole structure as parallel arrays indexed by node id.

You can simply read them:

clf.tree_.children_left   # id of each node's left child (-1 at a leaf)
clf.tree_.children_right  # id of each node's right child (-1 at a leaf)
clf.tree_.feature         # index of the feature each node splits on (-2 at a leaf)
clf.tree_.threshold       # split threshold of each node (-2 at a leaf)
clf.tree_.value           # per-class value stored at each node

For more details, look at the source code of the export_graphviz function.
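To get the list of splits the question asks for, one possible sketch walks those arrays and reports the feature, threshold and impurity of every internal node. The helper name `describe_splits` is hypothetical, and the "gain" it reports is computed here as the weighted impurity decrease from `tree_.impurity` and `tree_.weighted_n_node_samples`; it is not a value scikit-learn stores per node. The iris data and hyperparameters are again assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def describe_splits(clf, feature_names):
    """Return one dict per internal node of a fitted tree (hypothetical helper)."""
    t = clf.tree_
    splits = []
    for node in range(t.node_count):
        left, right = t.children_left[node], t.children_right[node]
        if left == -1:  # -1 marks a leaf, which has no split
            continue
        n = t.weighted_n_node_samples[node]
        n_left = t.weighted_n_node_samples[left]
        n_right = t.weighted_n_node_samples[right]
        # Parent impurity minus the sample-weighted impurity of the children.
        gain = t.impurity[node] - (
            n_left * t.impurity[left] + n_right * t.impurity[right]
        ) / n
        splits.append({
            "node": node,
            "feature": feature_names[t.feature[node]],
            "threshold": float(t.threshold[node]),
            "impurity": float(t.impurity[node]),
            "gain": float(gain),
        })
    return splits


iris = load_iris()  # assumption: illustrative dataset
for criterion in ("gini", "entropy"):
    clf = DecisionTreeClassifier(criterion=criterion, max_depth=2, random_state=0)
    clf.fit(iris.data, iris.target)
    print(criterion)
    for s in describe_splits(clf, iris.feature_names):
        print("  node {node}: {feature} <= {threshold:.3f}  "
              "impurity={impurity:.3f}  gain={gain:.3f}".format(**s))
```

Because the impurity array is expressed in the units of the criterion the tree was trained with, gains from a Gini tree and an entropy tree are not directly comparable in magnitude; the useful comparison is which feature and threshold each tree picks at the same position.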

In general, you can use the inspect module to list every member of an object:

from inspect import getmembers
print(getmembers(clf.tree_))  # list of (name, value) pairs, sorted by name
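The raw getmembers output is long because it also includes methods and dunder attributes. A small sketch, assuming you only want the data-carrying members, filters those out; the variable name `public_data` and the iris setup are illustrative assumptions.

```python
from inspect import getmembers

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()  # assumption: illustrative dataset
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# Keep public, non-callable members: the arrays and counters of the tree.
public_data = {
    name: value
    for name, value in getmembers(clf.tree_)
    if not name.startswith("_") and not callable(value)
}

for name, value in public_data.items():
    # Arrays report a shape; plain integers such as node_count report None.
    print(name, type(value).__name__, getattr(value, "shape", None))
```

The exact set of names depends on the installed scikit-learn version, which is one reason to inspect the object rather than rely on a fixed list.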
