eli5: show_weights() with two labels

Views: 986 | Published: 2020/5/18 0:58:38 | Tags: scikit-learn, nlp, regression


Question

I'm trying eli5 in order to understand how individual terms contribute to the prediction of certain classes.

You can run the following script:

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.datasets import fetch_20newsgroups

#categories = ['alt.atheism', 'soc.religion.christian']
categories = ['alt.atheism', 'soc.religion.christian', 'comp.graphics']

np.random.seed(1)
train = fetch_20newsgroups(subset='train', categories=categories, shuffle=True, random_state=7)
test = fetch_20newsgroups(subset='test', categories=categories, shuffle=True, random_state=7)

# named `bow` so that it matches the references in the pipeline and in show_weights()
bow = CountVectorizer(stop_words='english')
clf = LogisticRegression()
pipel = Pipeline([('bow', bow),
                 ('classifier', clf)])

pipel.fit(train.data, train.target)

import eli5
eli5.show_weights(clf, vec=bow, top=20)

The problem:

When working with two labels, the output is unfortunately limited to a single table:

categories = ['alt.atheism', 'soc.religion.christian']

With three labels, however, it outputs three tables, one per label:

categories = ['alt.atheism', 'soc.religion.christian', 'comp.graphics']

Is this a bug in the software, which drops y=0 from the first output, or am I missing a statistical point? I would expect to see two tables in the first case.

Recommended answer

This has nothing to do with eli5; it comes from how scikit-learn (in this case LogisticRegression()) treats two categories. With only two categories the problem becomes binary, so the fitted classifier exposes a single set of values for each learned attribute instead of one set per class.
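The shape difference can be checked directly, without eli5 or the newsgroups data. The following is only a minimal sketch: the random feature matrix and the names `clf_binary` and `clf_three` are made up for illustration and are not part of the original script.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (an assumption for illustration): 60 samples, 4 features
rng = np.random.RandomState(0)
X = rng.rand(60, 4)

y_binary = np.array([0, 1] * 30)     # two labels
y_three = np.array([0, 1, 2] * 20)   # three labels

clf_binary = LogisticRegression().fit(X, y_binary)
clf_three = LogisticRegression().fit(X, y_three)

# Two labels: one row of coefficients and one intercept
print(clf_binary.coef_.shape, clf_binary.intercept_.shape)   # (1, 4) (1,)

# Three labels: one row of coefficients and one intercept per class
print(clf_three.coef_.shape, clf_three.intercept_.shape)     # (3, 4) (3,)
```

One row of coefficients corresponds to one weight table, which matches the one-table versus three-table behaviour described in the question.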

Look at the attributes of LogisticRegression:

coef_ : array, shape (1, n_features) or (n_classes, n_features)

Coefficient of the features in the decision function.
coef_ is of shape (1, n_features) when the given problem is binary.

intercept_ : array, shape (1,) or (n_classes,)

Intercept (a.k.a. bias) added to the decision function.

If fit_intercept is set to False, the intercept is set to zero.
intercept_ is of shape (1,) when the problem is binary.

In the binary case coef_ has shape (1, n_features), and this coef_ is what eli5.show_weights() uses, which is why only one table is displayed.
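On the statistical side of the question, the single row still covers both classes. With default settings, a binary LogisticRegression scores the second class (classes_[1]) through a sigmoid of the linear score, and the probability of the first class is simply the complement. The sketch below checks this on synthetic data; the data and variable names are assumptions for illustration, and reading the negated weights as the "y=0 table" is an interpretation rather than something eli5 is documented here to do.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic binary problem (an assumption for illustration)
rng = np.random.RandomState(0)
X = rng.rand(40, 3)
y = (X[:, 0] > 0.5).astype(int)

clf = LogisticRegression().fit(X, y)

# Linear score built from the single coefficient row and the single intercept
score = X @ clf.coef_[0] + clf.intercept_[0]
p_pos = 1.0 / (1.0 + np.exp(-score))   # probability of classes_[1]

proba = clf.predict_proba(X)
print(np.allclose(score, clf.decision_function(X)))
print(np.allclose(proba[:, 1], p_pos))
print(np.allclose(proba[:, 0], 1.0 - p_pos))

# Weights read from the point of view of classes_[0]: the same numbers, sign flipped
weights_for_class_0 = -clf.coef_[0]
```

A feature with a positive weight pushes a document toward classes_[1] and, by the same amount, away from classes_[0], so a second table would carry no extra information.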

Hope this is clear.
