How do I include feature names in the plot_tree function from the XGBoost library?

Question

I've been using the XGBoost library to develop a binary classification model. Having trained my model, I am interested in visualizing the individual trees to better understand my model's predictions.

To do this, XGBoost provides a plot_tree function, but it only shows the integer index of each feature. Here is an example of one of my trees:

How do I include the feature name in this image rather than the feature index (f28)?

Recommended answer

The plot_tree function in xgboost has an argument fmap, which is the path to a 'feature map' file; this file contains a mapping from feature index to feature name.

The documentation on the feature map file is sparse, but it is a tab-delimited file with one line per feature. The first column is the feature index (starting from 0 and ending at the number of features minus one), the second column is the feature name, and the final column is an indicator of the feature type (q = quantitative feature, i = binary feature).
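To make the layout concrete, here is a minimal sketch of a reader for the format described above. The function name parse_feature_map is hypothetical (it is not part of XGBoost), and the check that indices run 0, 1, 2, … in order is an assumption based on the description, not a documented requirement.

```python
def parse_feature_map(text):
    """Parse feature map text into {index: (name, type)}.

    Hypothetical helper, not part of XGBoost. Assumes one
    'index<TAB>name<TAB>type' record per line, indices starting at 0.
    """
    mapping = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        index_str, name, ftype = line.split("\t")
        index = int(index_str)
        # Assumption: indices are contiguous and sorted, starting from 0.
        if index != len(mapping):
            raise ValueError(
                f"line {line_number}: expected index {len(mapping)}, got {index}"
            )
        mapping[index] = (name, ftype)
    return mapping


sample = "0\tfeature_name_0\tq\n1\tfeature_name_1\ti\n2\tfeature_name_2\tq\n"
feature_map = parse_feature_map(sample)
print(feature_map[1])
```

Running a check like this on your own file before plotting can catch a wrong delimiter or a skipped index early.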

Example feature_map.txt file:

0    feature_name_0    q
1    feature_name_1    i
2    feature_name_2    q
…          …           … 
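Writing this file by hand gets tedious with many features, so it may be easier to generate it. The following is a sketch only: write_feature_map is a hypothetical helper, the column names are made up for illustration, defaulting every type to q is an assumption, and replacing spaces in names is a precaution based on my understanding that the file is read as whitespace-separated fields.

```python
import os
import tempfile


def write_feature_map(feature_names, path, feature_types=None):
    """Write one 'index<TAB>name<TAB>type' line per feature.

    Hypothetical helper, not part of XGBoost.
    """
    if feature_types is None:
        # Assumption: treat every feature as quantitative ('q') by default.
        feature_types = ["q"] * len(feature_names)
    if len(feature_types) != len(feature_names):
        raise ValueError("feature_names and feature_types differ in length")
    with open(path, "w", encoding="utf-8") as f:
        for index, (name, ftype) in enumerate(zip(feature_names, feature_types)):
            # Assumption: whitespace inside a name could confuse the reader,
            # so spaces are replaced with underscores.
            safe_name = str(name).replace(" ", "_")
            f.write(f"{index}\t{safe_name}\t{ftype}\n")


# Hypothetical column names, for illustration only.
names = ["age", "is_member", "account balance"]
path = os.path.join(tempfile.mkdtemp(), "feature_map.txt")
write_feature_map(names, path, feature_types=["q", "i", "q"])

with open(path, encoding="utf-8") as f:
    written = f.read()
print(written)
```

If your training data lives in a pandas DataFrame, passing list(df.columns) as feature_names keeps the indices aligned with the column order used for training.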

With this tab-delimited file, you can then plot a tree from your trained model instance:

import xgboost

# X and y are your training features and labels
model = xgboost.XGBClassifier()

# train the model
model.fit(X, y)

# plot the first tree (index 0), providing the path to the feature map file
xgboost.plot_tree(model, num_trees=0, fmap='feature_map.txt')

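plot_tree draws the tree onto a matplotlib axes (it requires the matplotlib and graphviz packages); in a script, call matplotlib.pyplot.show() afterwards to display the plot.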
