How to get feature importance in xgboost?
Problem description
I'm using xgboost to build a model, and I'm trying to find the importance of each feature using get_fscore(), but it returns {}.
My training code is:
import xgboost as xgb

dtrain = xgb.DMatrix(X, label=Y)
watchlist = [(dtrain, 'train')]
param = {'max_depth': 6, 'learning_rate': 0.03}
num_round = 200
bst = xgb.train(param, dtrain, num_round, watchlist)
So is there any mistake in my training code? How can I get feature importance in xgboost?
Recommended answer
With your code, you can get the feature importance for each feature as a dict:
bst.get_score(importance_type='gain')
>>{'ftr_col1': 77.21064539577829,
'ftr_col2': 10.28690566363971,
'ftr_col3': 24.225014841466294,
'ftr_col4': 11.234086283060112}
Explanation: the train() API's get_score() method is defined as:
get_score(fmap='', importance_type='weight')
- fmap (str, optional) – the name of the feature map file.
- importance_type
- 'weight' - the number of times a feature is used to split the data across all trees.
- 'gain' - the average gain across all splits the feature is used in.
- 'cover' - the average coverage across all splits the feature is used in.
- 'total_gain' - the total gain across all splits the feature is used in.
- 'total_cover' - the total coverage across all splits the feature is used in.
https://xgboost.readthedocs.io/en/latest/python/python_api.html