Plot scikit-learn (sklearn) SVM decision boundary / surface


Problem description



I am currently performing multi-class SVM with a linear kernel using Python's scikit-learn library. The sample training and testing data are given below:

Model data:

x = [[20,32,45,33,32,44,0],[23,32,45,12,32,66,11],[16,32,45,12,32,44,23],[120,2,55,62,82,14,81],[30,222,115,12,42,64,91],[220,12,55,222,82,14,181],[30,222,315,12,222,64,111]]
y = [0,0,0,1,1,2,2]

I want to plot the decision boundary and visualize the dataset. Can someone please help me plot this type of data?

The data given above is just mock data, so feel free to change the values. It would be helpful if you could at least suggest the steps to follow. Thanks in advance.

Solution

You have to choose only 2 features to do this, because you cannot plot a 7D plot. After selecting the 2 features, use only these for the visualization of the decision surface.

(I have also written an article about this here: https://towardsdatascience.com/support-vector-machines-svm-clearly-explained-a-python-tutorial-for-classification-problems-29c539f3ad8?source=friends_link&sk=80f72ab272550d76a0cc3730d7c8af35)


Now, the next question you would ask: how can I choose these 2 features? Well, there are a lot of ways. You could do a univariate F-value (feature ranking) test and see which features/variables are the most important, then use those for the plot. Alternatively, you could reduce the dimensionality from 7 to 2 using, for example, PCA.
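As a minimal sketch of the univariate F-value route, using the mock data from the question (SelectKBest and f_classif are scikit-learn's built-in tools for this; k=2 is chosen here purely so the result can be plotted in 2D):

from sklearn.feature_selection import SelectKBest, f_classif
import numpy as np

x = np.array([[20,32,45,33,32,44,0],[23,32,45,12,32,66,11],
              [16,32,45,12,32,44,23],[120,2,55,62,82,14,81],
              [30,222,115,12,42,64,91],[220,12,55,222,82,14,181],
              [30,222,315,12,222,64,111]])
y = np.array([0,0,0,1,1,2,2])

selector = SelectKBest(f_classif, k=2)       # keep the 2 features with the highest F-value
x2 = selector.fit_transform(x, y)            # shape (7, 2), ready for a 2D decision-surface plot
print(selector.get_support(indices=True))    # indices of the selected columns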


2D plot for 2 features and using the iris dataset

from sklearn.svm import SVC
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets

iris = datasets.load_iris()
# Select 2 features / variable for the 2D plot that we are going to create.
X = iris.data[:, :2]  # we only take the first two features.
y = iris.target

def make_meshgrid(x, y, h=.02):
    x_min, x_max = x.min() - 1, x.max() + 1
    y_min, y_max = y.min() - 1, y.max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    return xx, yy

def plot_contours(ax, clf, xx, yy, **params):
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    out = ax.contourf(xx, yy, Z, **params)
    return out

model = svm.SVC(kernel='linear')
clf = model.fit(X, y)

fig, ax = plt.subplots()
# title for the plots
title = 'Decision surface of linear SVC'
# Set-up grid for plotting.
X0, X1 = X[:, 0], X[:, 1]
xx, yy = make_meshgrid(X0, X1)

plot_contours(ax, clf, xx, yy, cmap=plt.cm.coolwarm, alpha=0.8)
ax.scatter(X0, X1, c=y, cmap=plt.cm.coolwarm, s=20, edgecolors='k')
ax.set_ylabel('Sepal width')
ax.set_xlabel('Sepal length')
ax.set_xticks(())
ax.set_yticks(())
ax.set_title(title)
plt.show()
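The same helpers carry over to the question's own data once two features have been picked; a sketch, assuming x2 and y from the SelectKBest snippet above and the make_meshgrid/plot_contours functions just defined are in scope:

clf = svm.SVC(kernel='linear').fit(x2, y)

fig, ax = plt.subplots()
xx, yy = make_meshgrid(x2[:, 0], x2[:, 1], h=2.0)  # coarser step than 0.02: these features span hundreds of units
plot_contours(ax, clf, xx, yy, cmap=plt.cm.coolwarm, alpha=0.8)
ax.scatter(x2[:, 0], x2[:, 1], c=y, cmap=plt.cm.coolwarm, s=20, edgecolors='k')
plt.show()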


EDIT: Apply PCA to reduce dimensionality.

from sklearn.svm import SVC
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
from sklearn.decomposition import PCA

iris = datasets.load_iris()

X = iris.data  
y = iris.target

pca = PCA(n_components=2)
Xreduced = pca.fit_transform(X)

def make_meshgrid(x, y, h=.02):
    x_min, x_max = x.min() - 1, x.max() + 1
    y_min, y_max = y.min() - 1, y.max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    return xx, yy

def plot_contours(ax, clf, xx, yy, **params):
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    out = ax.contourf(xx, yy, Z, **params)
    return out

model = svm.SVC(kernel='linear')
clf = model.fit(Xreduced, y)

fig, ax = plt.subplots()
# Set-up grid for plotting.
X0, X1 = Xreduced[:, 0], Xreduced[:, 1]
xx, yy = make_meshgrid(X0, X1)

plot_contours(ax, clf, xx, yy, cmap=plt.cm.coolwarm, alpha=0.8)
ax.scatter(X0, X1, c=y, cmap=plt.cm.coolwarm, s=20, edgecolors='k')
ax.set_ylabel('PC2')
ax.set_xlabel('PC1')
ax.set_xticks(())
ax.set_yticks(())
ax.set_title('Decision surface using the PCA transformed/projected features')
plt.show()
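Before trusting the 2D picture, it is worth checking how much of the original 4D variance the two principal components keep; for iris it is high, roughly 97%, so the projection is faithful:

print(pca.explained_variance_ratio_)         # for iris, roughly [0.92, 0.05]
print(pca.explained_variance_ratio_.sum())   # ~0.97 of the total variance retained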


EDIT 1 (April 15th, 2020):

Case: 3D plot for 3 features and using the iris dataset

from sklearn.svm import SVC
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
from mpl_toolkits.mplot3d import Axes3D

iris = datasets.load_iris()
X = iris.data[:, :3]  # we only take the first three features.
Y = iris.target

# make it a binary classification problem
X = X[np.logical_or(Y==0,Y==1)]
Y = Y[np.logical_or(Y==0,Y==1)]

model = svm.SVC(kernel='linear')
clf = model.fit(X, Y)

# The separating plane is the set of points x for which np.dot(clf.coef_[0], x) + clf.intercept_[0] = 0.
# Solving for the third coordinate gives z = -(b + w1*x + w2*y) / w3:
z = lambda x,y: (-clf.intercept_[0]-clf.coef_[0][0]*x -clf.coef_[0][1]*y) / clf.coef_[0][2]

tmp = np.linspace(-5, 5, 30)   # grid over which to draw the plane; widen/shift to match your data range if needed
x,y = np.meshgrid(tmp,tmp)

fig = plt.figure()
ax  = fig.add_subplot(111, projection='3d')
ax.plot3D(X[Y==0,0], X[Y==0,1], X[Y==0,2],'ob')
ax.plot3D(X[Y==1,0], X[Y==1,1], X[Y==1,2],'sr')
ax.plot_surface(x, y, z(x,y))
ax.view_init(30, 60)
plt.show()
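A quick sanity check (not part of the original answer): every point sampled from the plotted plane should have a decision value of numerically zero, since the plane was solved from clf.coef_ and clf.intercept_:

# stack the plane's (x, y, z) samples into shape (n, 3) and evaluate the SVM on them
pts = np.c_[x.ravel(), y.ravel(), z(x, y).ravel()]
print(np.abs(clf.decision_function(pts)).max())  # ~0 up to floating-point error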
