Plot scikit-learn (sklearn) SVM decision boundary / surface


Problem description


I am currently performing multi-class SVM with a linear kernel using Python's scikit-learn library. The sample training and testing data are given below:

Model data:

x = [[20, 32, 45, 33, 32, 44, 0],
     [23, 32, 45, 12, 32, 66, 11],
     [16, 32, 45, 12, 32, 44, 23],
     [120, 2, 55, 62, 82, 14, 81],
     [30, 222, 115, 12, 42, 64, 91],
     [220, 12, 55, 222, 82, 14, 181],
     [30, 222, 315, 12, 222, 64, 111]]
y = [0, 0, 0, 1, 1, 2, 2]

I want to plot the decision boundary and visualize the dataset. Can someone please help me plot this type of data?

The data given above is just mock data, so feel free to change the values. It would be helpful if you could at least suggest the steps to be followed. Thanks in advance.

Solution

You have to choose only 2 features to do this, because you cannot visualize a 7D plot. After selecting the 2 features, use only these for the visualization of the decision surface.

(I have also written an article about this here: https://towardsdatascience.com/support-vector-machines-svm-clearly-explained-a-python-tutorial-for-classification-problems-29c539f3ad8?source=friends_link&sk=80f72ab272550d76a0cc3730d7c8af35)


Now, the next question you might ask is: how can I choose these 2 features? There are several ways. You could run a univariate F-value (feature-ranking) test and see which features/variables are the most important, then use those for the plot (a minimal sketch follows below). Alternatively, you could reduce the dimensionality from 7 to 2 using, for example, PCA.
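
As a concrete illustration (my addition, not part of the original answer), here is a minimal sketch of that univariate F-value ranking using scikit-learn's SelectKBest with f_classif, applied to the mock data from the question:

import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

# Mock data from the question: 7 samples, 7 features, 3 classes.
x = np.array([[20, 32, 45, 33, 32, 44, 0],
              [23, 32, 45, 12, 32, 66, 11],
              [16, 32, 45, 12, 32, 44, 23],
              [120, 2, 55, 62, 82, 14, 81],
              [30, 222, 115, 12, 42, 64, 91],
              [220, 12, 55, 222, 82, 14, 181],
              [30, 222, 315, 12, 222, 64, 111]])
y = np.array([0, 0, 0, 1, 1, 2, 2])

# Rank the features by their ANOVA F-value and keep the 2 best.
selector = SelectKBest(score_func=f_classif, k=2)
x2d = selector.fit_transform(x, y)

print(selector.scores_)                    # F-value of each of the 7 features
print(selector.get_support(indices=True))  # indices of the 2 selected features
# x2d has shape (7, 2) and can feed the 2D decision-surface code below.
# Note: with only 7 samples the scores are illustrative, not statistically meaningful.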


Case: 2D plot for 2 features, using the iris dataset

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets

iris = datasets.load_iris()
# Select 2 features / variable for the 2D plot that we are going to create.
X = iris.data[:, :2]  # we only take the first two features.
y = iris.target

def make_meshgrid(x, y, h=.02):
    x_min, x_max = x.min() - 1, x.max() + 1
    y_min, y_max = y.min() - 1, y.max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    return xx, yy

def plot_contours(ax, clf, xx, yy, **params):
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    out = ax.contourf(xx, yy, Z, **params)
    return out

model = svm.SVC(kernel='linear')
clf = model.fit(X, y)

fig, ax = plt.subplots()
# title for the plot
title = 'Decision surface of linear SVC'
# Set-up grid for plotting.
X0, X1 = X[:, 0], X[:, 1]
xx, yy = make_meshgrid(X0, X1)

plot_contours(ax, clf, xx, yy, cmap=plt.cm.coolwarm, alpha=0.8)
ax.scatter(X0, X1, c=y, cmap=plt.cm.coolwarm, s=20, edgecolors='k')
ax.set_ylabel('y label here')
ax.set_xlabel('x label here')
ax.set_xticks(())
ax.set_yticks(())
ax.set_title(title)
plt.show()
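
A side note of mine, not part of the original answer: in recent scikit-learn releases (1.1 and later), the two meshgrid helper functions above can be replaced by sklearn.inspection.DecisionBoundaryDisplay. A minimal sketch, assuming such a version is installed:

import matplotlib.pyplot as plt
from sklearn import svm, datasets
from sklearn.inspection import DecisionBoundaryDisplay  # scikit-learn >= 1.1

iris = datasets.load_iris()
X = iris.data[:, :2]  # first two features, as above
y = iris.target

clf = svm.SVC(kernel='linear').fit(X, y)

# Draws the filled decision regions; extra kwargs are passed on to contourf.
disp = DecisionBoundaryDisplay.from_estimator(clf, X, response_method='predict',
                                              cmap=plt.cm.coolwarm, alpha=0.8)
disp.ax_.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.coolwarm, s=20, edgecolors='k')
plt.show()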


EDIT: Apply PCA to reduce dimensionality.

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
from sklearn.decomposition import PCA

iris = datasets.load_iris()

X = iris.data  
y = iris.target

pca = PCA(n_components=2)
Xreduced = pca.fit_transform(X)

def make_meshgrid(x, y, h=.02):
    x_min, x_max = x.min() - 1, x.max() + 1
    y_min, y_max = y.min() - 1, y.max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    return xx, yy

def plot_contours(ax, clf, xx, yy, **params):
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    out = ax.contourf(xx, yy, Z, **params)
    return out

model = svm.SVC(kernel='linear')
clf = model.fit(Xreduced, y)

fig, ax = plt.subplots()
# Set-up grid for plotting.
X0, X1 = Xreduced[:, 0], Xreduced[:, 1]
xx, yy = make_meshgrid(X0, X1)

plot_contours(ax, clf, xx, yy, cmap=plt.cm.coolwarm, alpha=0.8)
ax.scatter(X0, X1, c=y, cmap=plt.cm.coolwarm, s=20, edgecolors='k')
ax.set_ylabel('PC2')
ax.set_xlabel('PC1')
ax.set_xticks(())
ax.set_yticks(())
ax.set_title('Decision surface using the PCA transformed/projected features')
plt.show()
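
One caveat worth adding (my note, not the original answer's): the 2D PCA plot is only faithful to the extent that the first two principal components capture most of the variance. You can check this with the pca object fitted above:

# Fraction of the total variance carried by each of the 2 kept components.
print(pca.explained_variance_ratio_)        # for iris, roughly [0.92, 0.05]
print(pca.explained_variance_ratio_.sum())  # ~0.98, so the 2D view is a good summary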


EDIT 1 (April 15th, 2020):

Case: 3D plot for 3 features, using the iris dataset

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
from mpl_toolkits.mplot3d import Axes3D  # registers the 3d projection on older matplotlib

iris = datasets.load_iris()
X = iris.data[:, :3]  # we only take the first three features.
Y = iris.target

# make it a binary classification problem (keep only classes 0 and 1)
X = X[np.logical_or(Y==0,Y==1)]
Y = Y[np.logical_or(Y==0,Y==1)]

model = svm.SVC(kernel='linear')
clf = model.fit(X, Y)

# The equation of the separating plane is given by all x such that np.dot(clf.coef_[0], x) + b = 0.
# Solve for w3 (the z coordinate):
z = lambda x,y: (-clf.intercept_[0]-clf.coef_[0][0]*x -clf.coef_[0][1]*y) / clf.coef_[0][2]

tmp = np.linspace(-5,5,30)
x,y = np.meshgrid(tmp,tmp)

fig = plt.figure()
ax  = fig.add_subplot(111, projection='3d')
ax.plot3D(X[Y==0,0], X[Y==0,1], X[Y==0,2],'ob')
ax.plot3D(X[Y==1,0], X[Y==1,1], X[Y==1,2],'sr')
ax.plot_surface(x, y, z(x,y))
ax.view_init(30, 60)
plt.show()
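
A small note on the snippet above (my addition): np.linspace(-5, 5, 30) is a fixed grid that lies mostly outside the range of the iris features, so the plotted plane extends far beyond the point cloud. A data-driven grid keeps it tight:

# Span the surface over the observed range of the first two features instead of [-5, 5].
xr = np.linspace(X[:, 0].min(), X[:, 0].max(), 30)
yr = np.linspace(X[:, 1].min(), X[:, 1].max(), 30)
x, y = np.meshgrid(xr, yr)  # then call ax.plot_surface(x, y, z(x, y)) as above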
