scikit-learn: return value of LogisticRegression.predict_proba


Question

What exactly does the LogisticRegression.predict_proba function return?

In my example I get a result like this:

[[  4.65761066e-03   9.95342389e-01]
 [  9.75851270e-01   2.41487300e-02]
 [  9.99983374e-01   1.66258341e-05]]

From other calculations using the sigmoid function, I know that the second column contains probabilities. The documentation says that the first column is n_samples, but that can't be right, because my samples are reviews, which are text and not numbers. The documentation also says that the second column is n_classes. That can't be right either, since I only have two classes (namely +1 and -1), and the function is supposed to compute the probability that a sample belongs to a class, not the classes themselves.

What is the first column really, and why is it there?

Recommended answer

4.65761066e-03 + 9.95342389e-01 = 1
9.75851270e-01 + 2.41487300e-02 = 1
9.99983374e-01 + 1.66258341e-05 = 1

Each row sums to 1: the first column is the probability that the entry has the -1 label, and the second column is the probability that the entry has the +1 label.
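To see where the column order comes from, here is a minimal sketch on hypothetical toy data (the arrays X and y are invented for illustration and are not the asker's reviews). It relies on two things that hold for scikit-learn classifiers: the (n_samples, n_classes) in the documentation describes the shape of the returned array, and the columns follow the order of the fitted model's classes_ attribute.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one numeric feature, labels -1 and +1.
X = np.array([[-2.0], [-1.0], [-0.5], [0.5], [1.0], [2.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

model = LogisticRegression().fit(X, y)
proba = model.predict_proba(X)

# classes_ gives the label that each column of predict_proba refers to.
print(model.classes_)

# One row per sample, one column per class: (n_samples, n_classes).
print(proba.shape)

# The class probabilities of every sample add up to 1.
print(proba.sum(axis=1))
```

With labels -1 and +1, classes_ is sorted, so column 0 corresponds to -1 and column 1 to +1.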

If you want the predicted probabilities for the positive label only, use logistic_model.predict_proba(data)[:, 1]. For the example above, this yields [9.95342389e-01, 2.41487300e-02, 1.66258341e-05].
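The slicing can be checked directly on the array from the question. This sketch uses only NumPy and copies the three rows of output as literals; the variable names are chosen here for illustration.

```python
import numpy as np

# Output of predict_proba copied from the question:
# one row per sample, one column per class.
proba = np.array([
    [4.65761066e-03, 9.95342389e-01],
    [9.75851270e-01, 2.41487300e-02],
    [9.99983374e-01, 1.66258341e-05],
])

negative = proba[:, 0]          # probability of the -1 label
positive = proba[:, 1]          # probability of the +1 label
row_sums = proba.sum(axis=1)    # each row sums to 1, up to rounding

print(positive)
print(row_sums)
```

Because there are only two classes, the first column is simply one minus the second, which is why it can look redundant in the binary case.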
