Predict certain label with highest possible probability in logistic regression


Problem description

I am building a model with 12 parameters and {0,1} labels using logistic regression in sklearn. I need to be very confident about label 0; it is fine if some '0' samples are misclassified as 1. The purpose is that I would like to exclude data from further processing if it is classified as 0.

How can I tune the parameters?

Solution

You are basically looking for specificity, which is defined as TN/(TN+FP), where TN is the number of true negatives and FP the number of false positives. You can read more about this in this blog post and in more detail here. To implement this, you need to use make_scorer along with the confusion_matrix metric from sklearn:

from sklearn.metrics import confusion_matrix
from sklearn.metrics import make_scorer

def get_TN_rate(y_true, y_pred):
    # Unpack the 2x2 confusion matrix into its four counts
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    # Specificity = TN / (TN + FP)
    specificity = float(tn) / (float(tn) + float(fp))
    return specificity

# Wrap the metric so it can be passed to sklearn's scoring API
tn_rate = make_scorer(get_TN_rate, greater_is_better=True)

Now you can use tn_rate as the scoring function to train your classifier.
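As a minimal sketch of how the scorer can be plugged in (the synthetic data, estimator settings, and parameter grid below are illustrative assumptions, not part of the original answer), you could pass tn_rate to GridSearchCV to tune the regularization strength C of a LogisticRegression while optimizing specificity:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the real data: 12 features, binary {0, 1} labels
X, y = make_classification(n_samples=500, n_features=12, random_state=0)

# Illustrative grid over the regularization strength C
param_grid = {'C': [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    scoring=tn_rate,  # optimize specificity rather than accuracy
    cv=5,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)

With the fitted estimator, samples predicted as 0 can then be dropped from further processing, as described in the question.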

