Changing colors for decision tree plot created using export graphviz
Question
I am using scikit-learn's regression tree function and graphviz to generate wonderful, easy-to-interpret visuals of some decision trees:
dot_data = tree.export_graphviz(Run.reg, out_file=None,
                                feature_names=Xvar,
                                filled=True, rounded=True,
                                special_characters=True)
graph = pydotplus.graph_from_dot_data(dot_data)
graph.write_png('CART.png')
graph.write_svg('CART.svg')
This runs perfectly, but I'd like to change the color scheme if possible. The plot represents CO2 fluxes, so I'd like to make the negative values green and the positive values brown. I can export as SVG instead and alter everything manually, but when I do, the text doesn't quite line up with the boxes, so changing the colors manually and fixing all the text adds a very tedious step to my workflow that I would really like to avoid!
Also, I've seen some trees where the length of the lines connecting nodes is proportional to the % variance explained by the split. I'd love to be able to do that too, if possible.
Recommended answer
- You can get a list of all the edges via graph.get_edge_list().
- Each source node should have two target nodes; the one with the lower index is the True branch and the one with the higher index is the False branch.
- Colors can be assigned via set_fillcolor().
import collections

import pydotplus
from sklearn import tree
from sklearn.datasets import load_iris

clf = tree.DecisionTreeClassifier(random_state=42)
iris = load_iris()
clf = clf.fit(iris.data, iris.target)

dot_data = tree.export_graphviz(clf,
                                feature_names=iris.feature_names,
                                out_file=None,
                                filled=True,
                                rounded=True)
graph = pydotplus.graph_from_dot_data(dot_data)

colors = ('brown', 'forestgreen')
edges = collections.defaultdict(list)

# Group the child node ids under the id of their parent node.
for edge in graph.get_edge_list():
    edges[edge.get_source()].append(int(edge.get_destination()))

# The lower-index child (True branch) gets colors[0], the other gets colors[1].
for edge in edges:
    edges[edge].sort()
    for i in range(2):
        dest = graph.get_node(str(edges[edge][i]))[0]
        dest.set_fillcolor(colors[i])

graph.write_png('tree.png')
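The code above colors nodes by branch, whereas the question asks for colors based on the sign of the predicted value. A minimal sketch of one way to do that follows. It rewrites the DOT text before it is handed to pydotplus and uses only the standard library. The helper name recolor_by_sign is hypothetical. The sketch assumes a single-output regression tree exported with filled=True, where each node sits on one line containing both `value = <number>` and `fillcolor="..."`; the exact label format may differ between scikit-learn versions.

```python
import re

# Assumed label fragment for a single-output regression tree: "value = <number>".
_VALUE_RE = re.compile(r'value = (-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)')
# Fill color attribute written by export_graphviz when filled=True.
_FILL_RE = re.compile(r'fillcolor="[^"]*"')


def recolor_by_sign(dot_data, negative='forestgreen', positive='brown'):
    """Return DOT source with each node's fillcolor chosen by the sign of its value.

    Lines without both a value and a fillcolor (edges, defaults) are left as-is.
    A value of exactly zero is treated as positive.
    """
    out = []
    for line in dot_data.splitlines():
        match = _VALUE_RE.search(line)
        if match and _FILL_RE.search(line):
            color = negative if float(match.group(1)) < 0 else positive
            line = _FILL_RE.sub('fillcolor="%s"' % color, line)
        out.append(line)
    return '\n'.join(out)
```

The recolored text can then be passed on as before, for example graph = pydotplus.graph_from_dot_data(recolor_by_sign(dot_data)).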
"Also, I've seen some trees where the length of the lines connecting nodes is proportional to the % variance explained by the split. I'd love to be able to do that too, if possible."
You could play with set_weight() and set_len(), but that's a bit trickier and needs some fiddling to get right. Here is some code to get you started:
# Assumes node labels use '<br/>' separators, i.e. the tree was exported
# with special_characters=True.
for edge in edges:
    edges[edge].sort()
    src = graph.get_node(edge)[0]
    total_weight = int(src.get_attributes()['label'].split('samples = ')[1].split('<br/>')[0])
    for i in range(2):
        child = str(edges[edge][i])
        dest = graph.get_node(child)[0]
        weight = int(dest.get_attributes()['label'].split('samples = ')[1].split('<br/>')[0])
        # Fraction of the parent's samples that reach this child.
        fraction = float(weight) / total_weight
        # Apply the values to the edge leading to this child.
        out_edge = graph.get_edge(edge, child)[0]
        out_edge.set_weight((1 - fraction) * 100)
        out_edge.set_len(fraction)
        out_edge.set_minlen(fraction)
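The string splitting above depends on the label separator, which differs depending on how the tree was exported. As a rough sketch, the sample count could instead be pulled out with a regular expression that does not care about the separator. The helper names samples_from_label and child_fraction are hypothetical. The sketch assumes the label contains a `samples = <number>` fragment, and it returns a float so that percentage-style sample values would also parse.

```python
import re

# Matches the number after "samples = ", whatever separator follows it.
_SAMPLES_RE = re.compile(r'samples = (\d+(?:\.\d+)?)')


def samples_from_label(label):
    """Extract the sample count (or percentage) from a node label string."""
    match = _SAMPLES_RE.search(label)
    if match is None:
        raise ValueError('no "samples = ..." fragment in label: %r' % label)
    return float(match.group(1))


def child_fraction(parent_label, child_label):
    """Fraction of the parent's samples that reach the child node."""
    return samples_from_label(child_label) / samples_from_label(parent_label)
```

In the loop above, child_fraction(src.get_attributes()['label'], dest.get_attributes()['label']) could stand in for the two split() expressions.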