Plotting decision boundary of logistic regression


Problem description

I'm implementing logistic regression. I managed to get probabilities out of it, and am able to predict a 2-class classification task.

My question is:

For my final model, I have the weights and the training data. There are 2 features, so my weight is a vector with 2 rows.

How do I plot this? I saw this post, but I don't quite understand the answer. Do I need a contour plot?

Recommended answer

An advantage of the logistic regression classifier is that once you fit it, you can get probabilities for any sample vector. That may be more interesting to plot. Here's an example using scikit-learn:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="white")

First, generate the data and fit the classifier to the training set:

# n_informative and n_redundant are keyword-only in modern scikit-learn
X, y = make_classification(200, 2, n_informative=2, n_redundant=0,
                           weights=[.5, .5], random_state=15)
clf = LogisticRegression().fit(X[:100], y[:100])

Next, make a continuous grid of values and evaluate the probability of each (x, y) point in the grid:

xx, yy = np.mgrid[-5:5:.01, -5:5:.01]
grid = np.c_[xx.ravel(), yy.ravel()]
probs = clf.predict_proba(grid)[:, 1].reshape(xx.shape)

Now, plot the probability grid as a contour map and additionally show the test set samples on top of it:

f, ax = plt.subplots(figsize=(8, 6))
contour = ax.contourf(xx, yy, probs, 25, cmap="RdBu",
                      vmin=0, vmax=1)
ax_c = f.colorbar(contour)
ax_c.set_label("$P(y = 1)$")
ax_c.set_ticks([0, .25, .5, .75, 1])

ax.scatter(X[100:,0], X[100:, 1], c=y[100:], s=50,
           cmap="RdBu", vmin=-.2, vmax=1.2,
           edgecolor="white", linewidth=1)

ax.set(aspect="equal",
       xlim=(-5, 5), ylim=(-5, 5),
       xlabel="$X_1$", ylabel="$X_2$")

Logistic regression lets you classify new samples based on any threshold you want, so it doesn't inherently have one "decision boundary." But, of course, a common decision rule to use is p = .5. We can also just draw that contour level using the above code:

f, ax = plt.subplots(figsize=(8, 6))
ax.contour(xx, yy, probs, levels=[.5], cmap="Greys", vmin=0, vmax=.6)

ax.scatter(X[100:,0], X[100:, 1], c=y[100:], s=50,
           cmap="RdBu", vmin=-.2, vmax=1.2,
           edgecolor="white", linewidth=1)

ax.set(aspect="equal",
       xlim=(-5, 5), ylim=(-5, 5),
       xlabel="$X_1$", ylabel="$X_2$")
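Since the question starts from the fitted weights, it may help to note that the p = .5 boundary can also be drawn directly from them, without a probability grid: the boundary is where the log-odds are zero, i.e. w1·x1 + w2·x2 + b = 0. A minimal sketch under the same setup as above (and assuming w2 is nonzero):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Same data and fit as above
X, y = make_classification(200, 2, n_informative=2, n_redundant=0,
                           weights=[.5, .5], random_state=15)
clf = LogisticRegression().fit(X[:100], y[:100])

# At p = .5 the log-odds are zero: w1*x1 + w2*x2 + b = 0,
# so the boundary is the line x2 = -(w1*x1 + b) / w2 (assumes w2 != 0)
w1, w2 = clf.coef_[0]
b = clf.intercept_[0]

x1 = np.linspace(-5, 5, 100)
x2 = -(w1 * x1 + b) / w2

f, ax = plt.subplots(figsize=(8, 6))
ax.plot(x1, x2, "k--")
ax.scatter(X[100:, 0], X[100:, 1], c=y[100:], s=50,
           cmap="RdBu", vmin=-.2, vmax=1.2,
           edgecolor="white", linewidth=1)
ax.set(aspect="equal", xlim=(-5, 5), ylim=(-5, 5),
       xlabel="$X_1$", ylabel="$X_2$")
```

This draws the same line as the `levels=[.5]` contour above; the contour approach is more general, since it also works for classifiers whose boundary isn't a straight line.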

