How can I get the relative importance of features of a logistic regression for a particular prediction?


Question

I am using a Logistic Regression (in scikit) for a binary classification problem, and am interested in being able to explain each individual prediction. To be more precise, I'm interested in predicting the probability of the positive class, and having a measure of the importance of each feature for that prediction.

Using the coefficients (betas) as a measure of importance is generally a bad idea, as answered here, but I have yet to find a good alternative.

So far the best I have found are the following 3 options:

  1. Monte Carlo Option: Holding all other features fixed, re-run the prediction, replacing the feature we want to evaluate with random samples from the training set. Do this a large number of times. This establishes a baseline probability for the positive class, which we then compare with the positive-class probability of the original run. The difference is a measure of the importance of the feature.
  2. "Leave-one-out" classifiers: To evaluate the importance of a feature, first create a model which uses all features, and then another that uses all features except the one being tested. Predict the new observation using both models. The difference between the two would be the importance of the feature.
  3. Adjusted betas: Based on this answer, ranking the importance of the features by 'the magnitude of its coefficient times the standard deviation of the corresponding parameter in the data.'
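Option 1 can be sketched roughly as follows. This is a minimal illustration on synthetic data; the names `X_train`, `model`, `x` and `mc_importance` are mine, not from the question:

```python
# Sketch of option 1 (Monte Carlo): estimate the importance of one feature
# for a single prediction by resampling that feature from the training set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train, y_train = make_classification(n_samples=500, n_features=4, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

def mc_importance(model, x, X_train, feat_idx, n_draws=1000):
    """Change in P(y=1) for observation x when feature `feat_idx`
    is replaced by random draws from the training distribution."""
    p_orig = model.predict_proba(x.reshape(1, -1))[0, 1]
    X_rep = np.tile(x, (n_draws, 1))                      # n_draws copies of x
    X_rep[:, feat_idx] = rng.choice(X_train[:, feat_idx], size=n_draws)
    p_baseline = model.predict_proba(X_rep)[:, 1].mean()  # Monte Carlo baseline
    return p_orig - p_baseline

x = X_train[0]  # the observation we want to explain
importances = [mc_importance(model, x, X_train, j) for j in range(X_train.shape[1])]
```

A positive value means the actual feature value pushes the prediction toward the positive class relative to a "typical" value of that feature.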

All options (using betas, Monte Carlo and "Leave-one-out") seem like poor solutions to me.

  1. Monte Carlo depends on the distribution of the training set, and I could not find any literature to support it.
  2. "Leave-one-out" is easily fooled by two correlated features (when one is absent, the other steps in to compensate, and both end up assigned an importance of 0).
  3. Adjusted betas sound reasonable, but I could not find any literature to support them either.
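For comparison, option 2 ("leave-one-out" classifiers) can be sketched like this; note it retrains one model per feature and, as pointed out above, correlated features can mask each other. Names (`X`, `y`, `full`, `reduced`) are illustrative:

```python
# Sketch of option 2: retrain without each feature and compare the
# predicted positive-class probabilities for one observation.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
full = LogisticRegression().fit(X, y)

x = X[0]
p_full = full.predict_proba(x.reshape(1, -1))[0, 1]

importances = []
for j in range(X.shape[1]):
    keep = [k for k in range(X.shape[1]) if k != j]   # drop feature j
    reduced = LogisticRegression().fit(X[:, keep], y)
    p_reduced = reduced.predict_proba(x[keep].reshape(1, -1))[0, 1]
    importances.append(p_full - p_reduced)
```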

Actual question: What is the best way to interpret the importance of each feature, at the moment of a decision, with a linear classifier?

Quick note #1: for Random Forests this is trivial, we can simply use the prediction + bias decomposition, as explained beautifully in this blog post. The problem here is how to do something similar with linear classifiers such as Logistic Regression.

Quick note #2: there are a number of related questions on stackoverflow (1 2 3 4 5). I have not been able to find an answer to this specific question.

Answer

If you want the importance of the features for a particular decision, why not simulate the decision_function (which scikit-learn provides, so you can check that you get the same value) step by step? The decision function for linear classifiers is simply:

intercept_ + coef_[0]*feature[0] + coef_[1]*feature[1] + ...

The importance of a feature i is then just coef_[i]*feature[i]. Of course, this is similar to looking at the magnitude of the coefficients, but since the coefficient is multiplied by the actual feature value, and that is exactly what happens under the hood, it might be your best bet.
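Concretely, the per-feature contributions can be computed and verified against scikit-learn's own decision_function in a few lines (synthetic data, illustrative names):

```python
# Decompose a linear classifier's decision function into per-feature
# contributions coef_[i] * feature[i], plus the intercept.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)
clf = LogisticRegression().fit(X, y)

x = X[0]
contributions = clf.coef_[0] * x                 # one term per feature
score = clf.intercept_[0] + contributions.sum()  # reconstructed decision value

# Check that the step-by-step sum matches scikit-learn's decision_function
assert np.isclose(score, clf.decision_function(x.reshape(1, -1))[0])
```

Each contribution shifts the log-odds of the positive class, so its sign and magnitude tell you how much that feature's actual value pushed this particular prediction.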
