Plot scikit-learn (sklearn) SVM decision boundary / surface
I am currently performing multi-class SVM classification with a linear kernel using Python's scikit-learn library. The sample training and testing data are given below:
Model data:
x = [[20,32,45,33,32,44,0],[23,32,45,12,32,66,11],[16,32,45,12,32,44,23],[120,2,55,62,82,14,81],[30,222,115,12,42,64,91],[220,12,55,222,82,14,181],[30,222,315,12,222,64,111]]
y = [0,0,0,1,1,2,2]
I want to plot the decision boundary and visualize the dataset. Can someone please help me plot this type of data?
The data given above is just mock data, so feel free to change the values. It would be helpful if you could at least suggest the steps to follow. Thanks in advance.
You have to choose only 2 features to do this, because you cannot draw a 7D plot. After selecting the 2 features, fit the classifier on those 2 features only and use them to visualize the decision surface.
(I have also written an article about this here: https://towardsdatascience.com/support-vector-machines-svm-clearly-explained-a-python-tutorial-for-classification-problems-29c539f3ad8?source=friends_link&sk=80f72ab272550d76a0cc3730d7c8af35)
Now, the next question you would ask: how can I choose these 2 features? There are many ways. You could run a univariate F-value (feature ranking) test to see which features/variables are the most important, and then use those for the plot. Alternatively, you could reduce the dimensionality from 7 to 2 using PCA, for example.
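As a rough illustration of the univariate F-value ranking mentioned above, here is a minimal sketch using scikit-learn's SelectKBest with f_classif on the iris dataset. The choice of k=2 and the variable names (selector, X_top2, top2_idx) are assumptions made for this example, not something prescribed by the answer.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

iris = load_iris()
X, y = iris.data, iris.target

# Score every feature with a one-way ANOVA F-test against the class labels
# and keep the 2 highest-scoring features (k=2 is an assumption for plotting).
selector = SelectKBest(score_func=f_classif, k=2)
X_top2 = selector.fit_transform(X, y)

# Column indices (in the original feature matrix) of the selected features.
top2_idx = selector.get_support(indices=True)

for name, score in zip(iris.feature_names, selector.scores_):
    print(f"{name}: F = {score:.1f}")
print("selected:", [iris.feature_names[i] for i in top2_idx])
```

X_top2 can then be used in place of iris.data[:, :2] in the 2D plotting code. Note that a univariate test ranks features one at a time, so it may miss features that are only useful in combination.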
2D plot for 2 features and using the iris dataset
from sklearn.svm import SVC
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
iris = datasets.load_iris()
# Select 2 features / variable for the 2D plot that we are going to create.
X = iris.data[:, :2] # we only take the first two features.
y = iris.target
def make_meshgrid(x, y, h=.02):
    # Build a grid covering the data range (with a margin of 1) at step h.
    x_min, x_max = x.min() - 1, x.max() + 1
    y_min, y_max = y.min() - 1, y.max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    return xx, yy

def plot_contours(ax, clf, xx, yy, **params):
    # Predict the class at every grid point and draw filled contours.
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    out = ax.contourf(xx, yy, Z, **params)
    return out
model = svm.SVC(kernel='linear')
clf = model.fit(X, y)
fig, ax = plt.subplots()
# title for the plots
title = ('Decision surface of linear SVC ')
# Set-up grid for plotting.
X0, X1 = X[:, 0], X[:, 1]
xx, yy = make_meshgrid(X0, X1)
plot_contours(ax, clf, xx, yy, cmap=plt.cm.coolwarm, alpha=0.8)
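scatter = ax.scatter(X0, X1, c=y, cmap=plt.cm.coolwarm, s=20, edgecolors='k')
ax.set_ylabel('y label here')
ax.set_xlabel('x label here')
ax.set_xticks(())
ax.set_yticks(())
ax.set_title(title)
# A bare ax.legend() finds no labelled artists; build the legend from the scatter colours.
ax.legend(*scatter.legend_elements(), title='Class')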
plt.show()
EDIT: Apply PCA to reduce dimensionality.
from sklearn.svm import SVC
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
from sklearn.decomposition import PCA
iris = datasets.load_iris()
X = iris.data
y = iris.target
pca = PCA(n_components=2)
Xreduced = pca.fit_transform(X)
def make_meshgrid(x, y, h=.02):
    # Build a grid covering the data range (with a margin of 1) at step h.
    x_min, x_max = x.min() - 1, x.max() + 1
    y_min, y_max = y.min() - 1, y.max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    return xx, yy

def plot_contours(ax, clf, xx, yy, **params):
    # Predict the class at every grid point and draw filled contours.
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    out = ax.contourf(xx, yy, Z, **params)
    return out
model = svm.SVC(kernel='linear')
clf = model.fit(Xreduced, y)
fig, ax = plt.subplots()
# title for the plots
title = ('Decision surface of linear SVC ')
# Set-up grid for plotting.
X0, X1 = Xreduced[:, 0], Xreduced[:, 1]
xx, yy = make_meshgrid(X0, X1)
plot_contours(ax, clf, xx, yy, cmap=plt.cm.coolwarm, alpha=0.8)
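scatter = ax.scatter(X0, X1, c=y, cmap=plt.cm.coolwarm, s=20, edgecolors='k')
ax.set_ylabel('PC2')
ax.set_xlabel('PC1')
ax.set_xticks(())
ax.set_yticks(())
ax.set_title('Decision surface using the PCA transformed/projected features')
# A bare ax.legend() finds no labelled artists; build the legend from the scatter colours.
ax.legend(*scatter.legend_elements(), title='Class')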
plt.show()
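The same PCA approach can be applied to the 7-feature mock data from the question. The following is only a sketch under a few assumptions: the grid uses a fixed 200 x 200 resolution via np.linspace instead of the h=.02 step (the projected values span hundreds of units, so a 0.02 step would create an impractically large grid), and the 10% padding is an arbitrary choice. Keep in mind that the plotted boundary belongs to an SVM trained on the 2D projection, not to a model trained on all 7 features.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.svm import SVC

# Mock data from the question: 7 samples, 7 features, 3 classes.
X = np.array([
    [20, 32, 45, 33, 32, 44, 0],
    [23, 32, 45, 12, 32, 66, 11],
    [16, 32, 45, 12, 32, 44, 23],
    [120, 2, 55, 62, 82, 14, 81],
    [30, 222, 115, 12, 42, 64, 91],
    [220, 12, 55, 222, 82, 14, 181],
    [30, 222, 315, 12, 222, 64, 111],
])
y = np.array([0, 0, 0, 1, 1, 2, 2])

# Project 7D -> 2D, then train the SVM on the projected data.
Xreduced = PCA(n_components=2).fit_transform(X)
clf = SVC(kernel='linear').fit(Xreduced, y)

# Fixed-resolution grid with 10% padding around the projected points
# (both values are assumptions for this sketch).
pad = 0.1 * np.ptp(Xreduced, axis=0)
x_min, y_min = Xreduced.min(axis=0) - pad
x_max, y_max = Xreduced.max(axis=0) + pad
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 200),
                     np.linspace(y_min, y_max, 200))
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

fig, ax = plt.subplots()
ax.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8)
ax.scatter(Xreduced[:, 0], Xreduced[:, 1], c=y,
           cmap=plt.cm.coolwarm, s=40, edgecolors='k')
ax.set_xlabel('PC1')
ax.set_ylabel('PC2')
ax.set_title('Linear SVC on the PCA-projected mock data')
plt.show()
```

With features on very different scales, standardizing the data before PCA (for example with StandardScaler) may give a more meaningful projection; that step is left out here to stay close to the example above.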
EDIT 1 (April 15th, 2020):
Case: 3D plot for 3 features and using the iris dataset
from sklearn.svm import SVC
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets
from mpl_toolkits.mplot3d import Axes3D
iris = datasets.load_iris()
X = iris.data[:, :3] # we only take the first three features.
Y = iris.target
# Make it a binary classification problem (keep classes 0 and 1 only)
X = X[np.logical_or(Y==0,Y==1)]
Y = Y[np.logical_or(Y==0,Y==1)]
model = svm.SVC(kernel='linear')
clf = model.fit(X, Y)
# The separating plane is the set of points x with np.dot(clf.coef_[0], x) + clf.intercept_[0] = 0.
# Solve for the third coordinate (z).
z = lambda x,y: (-clf.intercept_[0]-clf.coef_[0][0]*x -clf.coef_[0][1]*y) / clf.coef_[0][2]
tmp = np.linspace(-5,5,30)
x,y = np.meshgrid(tmp,tmp)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot3D(X[Y==0,0], X[Y==0,1], X[Y==0,2],'ob')
ax.plot3D(X[Y==1,0], X[Y==1,1], X[Y==1,2],'sr')
ax.plot_surface(x, y, z(x,y))
ax.view_init(30, 60)
plt.show()
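One way to sanity-check the plane equation used above is to confirm that points lying on the computed surface get a decision value of (numerically) zero from the fitted classifier. The sketch below repeats the binary iris setup and, as an assumption of this example, builds the grid from the observed range of the first two features rather than from a fixed [-5, 5] interval, so the plane covers the region where the data actually lies.

```python
import numpy as np
from sklearn import datasets, svm

iris = datasets.load_iris()
X = iris.data[:, :3]  # first three features
Y = iris.target

# Keep classes 0 and 1 only (binary problem, as in the example above).
mask = Y < 2
X, Y = X[mask], Y[mask]

clf = svm.SVC(kernel='linear').fit(X, Y)
w, b = clf.coef_[0], clf.intercept_[0]

# Grid over the observed range of the first two features
# (an assumption of this sketch; the original uses linspace(-5, 5, 30)).
gx = np.linspace(X[:, 0].min(), X[:, 0].max(), 30)
gy = np.linspace(X[:, 1].min(), X[:, 1].max(), 30)
x, y = np.meshgrid(gx, gy)

# Solve w[0]*x + w[1]*y + w[2]*z + b = 0 for z.
z = (-b - w[0] * x - w[1] * y) / w[2]

# Every point on the plane should have a decision value of about 0.
plane_points = np.c_[x.ravel(), y.ravel(), z.ravel()]
residual = clf.decision_function(plane_points)
print("max |decision_function| on the plane:", np.abs(residual).max())
```

The arrays x, y and z can be passed to ax.plot_surface(x, y, z) exactly as in the plotting code above. The division assumes w[2] is non-zero; if the plane were parallel to the z axis, you would have to solve for a different coordinate.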