Determining the most contributing features for an SVM classifier in sklearn

Problem description

I have a dataset and I want to train my model on it. After training, I need to know which features contribute most to the classification made by an SVM classifier.

Forest algorithms provide something called feature importance. Is there anything similar for SVMs?

Solution

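Yes, the SVM classifier has a coef_ attribute, but it is only available for an SVM with a linear kernel. With a linear kernel, each entry of coef_ is the weight of one input feature in the decision function, so it can be read as that feature's contribution. For other kernels this is not possible, because the kernel method implicitly maps the data into another space that is not directly related to the input space, so there is no weight per input feature.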
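To make the restriction concrete, here is a minimal sketch on synthetic data. The dataset generated with make_classification and all variable names are assumptions for illustration and are not part of the original question. It checks that a linear-kernel SVC exposes coef_ with one weight per input feature, while accessing coef_ on an RBF-kernel SVC raises AttributeError.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic two-class data, used only for illustration (an assumption).
X, y = make_classification(n_samples=100, n_features=4, n_informative=2,
                           n_redundant=0, random_state=0)

linear_clf = SVC(kernel='linear').fit(X, y)
# For a two-class problem coef_ holds one row with one weight per input feature.
linear_coef_shape = linear_clf.coef_.shape

rbf_clf = SVC(kernel='rbf').fit(X, y)
try:
    rbf_clf.coef_
    rbf_has_coef = True
except AttributeError:
    # scikit-learn only defines coef_ when kernel='linear'.
    rbf_has_coef = False

print(linear_coef_shape, rbf_has_coef)
```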

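from matplotlib import pyplot as plt
from sklearn import svm

def f_importances(coef, names):
    # Sort the features by coefficient value and draw one horizontal bar per feature.
    imp, names = zip(*sorted(zip(coef, names)))
    plt.barh(range(len(names)), imp, align='center')
    plt.yticks(range(len(names)), names)
    plt.show()

# X (feature matrix with two columns) and Y (class labels) are the training data,
# assumed to be defined beforehand.
features_names = ['input1', 'input2']
clf = svm.SVC(kernel='linear')
clf.fit(X, Y)
# For a two-class problem coef_ has shape (1, n_features), so pass its first row.
f_importances(clf.coef_[0], features_names)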

The function draws a horizontal bar chart with one bar per feature, sorted by coefficient value.
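The sign of a coefficient only indicates which class a feature pushes the decision towards, so a ranking by contribution is usually based on the absolute value. The following is a minimal sketch of that idea, not part of the original answer. It assumes a two-class problem and standardized features, so that coefficient magnitudes are on a comparable scale. The helper rank_features, the synthetic dataset, and the feature names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def rank_features(coef, names):
    """Return (name, weight) pairs ordered by decreasing absolute weight."""
    coef = np.asarray(coef).ravel()
    order = np.argsort(-np.abs(coef))
    return [(names[i], float(coef[i])) for i in order]

# Synthetic two-class data for illustration only (an assumption).
X, y = make_classification(n_samples=200, n_features=3, n_informative=2,
                           n_redundant=0, random_state=0)
feature_names = ['input1', 'input2', 'input3']

# Standardize so that coefficient magnitudes are on a comparable scale.
X_scaled = StandardScaler().fit_transform(X)
clf = SVC(kernel='linear').fit(X_scaled, y)

# coef_[0] is the single weight vector of a two-class linear SVM.
ranking = rank_features(clf.coef_[0], feature_names)
for name, weight in ranking:
    print(f"{name}: {weight:+.3f}")
```

The same ranking could be passed to a plotting helper such as f_importances by unzipping the pairs into weights and names.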
