Cannot understand plotting of decision boundary in SVM and LR
Question
For example, we have f(x) = x. How do we plot it? We take some x, calculate the corresponding y, repeat, and then plot the chart point by point. Simple and clear.
But I cannot understand so clearly how to plot a decision boundary - there is no y to plot, only x.
Python code for SVM:
import numpy as np
import pylab as pl
from sklearn import svm

# X is assumed to be an (n_samples, 2) feature array and y its label
# vector, loaded earlier (the question omits the data-loading step)
h = .02  # step size in the mesh
Y = y
# we create an instance of SVM and fit out data. We do not scale our
# data since we want to plot the support vectors
C = 1.0 # SVM regularization parameter
svc = svm.SVC(kernel='linear', C=C).fit(X, Y)
rbf_svc = svm.SVC(kernel='rbf', gamma=0.7, C=C).fit(X, Y)
poly_svc = svm.SVC(kernel='poly', degree=3, C=C).fit(X, Y)
lin_svc = svm.LinearSVC(C=C).fit(X, Y)
# create a mesh to plot in
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))
for i, clf in enumerate((svc, rbf_svc, poly_svc, lin_svc)):
    # Plot the decision boundary. For that, we will assign a color to each
    # point in the mesh [x_min, x_max]x[y_min, y_max].
As I understand it, everything that plots the chart happens here:
    pl.subplot(2, 2, i + 1)
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

    # Put the result into a color plot
    Z = Z.reshape(xx.shape)
    pl.contourf(xx, yy, Z, cmap=pl.cm.Paired)
    pl.axis('off')

    # Plot also the training points
    pl.scatter(X[:, 0], X[:, 1], c=Y, cmap=pl.cm.Paired)

pl.show()
Can someone explain in words how this plotting works?
Answer
Basically, you are plotting the function f : R^2 -> {0,1}, so it is a function from the 2-dimensional space into the degenerate space of only two values - 0 and 1.
First, you generate the mesh you want to visualize your function on. In the case of your example with f(x)=y, you would select some interval [x_min, x_max], take points on it with some distance eps, and plot the corresponding values of f:
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))
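The meshing step can be sketched in isolation. The four sample points in X and the coarse step h below are made-up toy values (the question uses h = 0.02); the real data would work the same way:

```python
import numpy as np

# Hypothetical toy data standing in for the question's X:
# four points with two features each.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
h = 0.5  # coarser step than the question's 0.02, to keep the mesh small

x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))

# Both grids have one row per y step and one column per x step, so
# together they enumerate every (x, y) point of the mesh.
print(xx.shape, yy.shape)  # (6, 6) (6, 6)
```

Each position (i, j) of the mesh corresponds to the point (xx[i, j], yy[i, j]), which is exactly what the prediction step consumes.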
Next, we calculate the function values; in our case this is the SVM.predict function, which returns either 0 or 1:
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
which gives the values of f for all analyzed points x.
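The ravel/np.c_/reshape mechanics of that line can be sketched on a tiny grid, with a hand-written stand-in for clf.predict (the threshold rule below is invented purely for illustration):

```python
import numpy as np

xx, yy = np.meshgrid(np.arange(0.0, 2.0, 1.0), np.arange(0.0, 3.0, 1.0))

# Stand-in for clf.predict: label 1 where x + y > 1.5, else 0.
# (A hypothetical decision rule, used only to show the array mechanics.)
def predict(points):
    return (points[:, 0] + points[:, 1] > 1.5).astype(int)

# np.c_ pairs the flattened grids into an (n_points, 2) array of
# coordinates, predict labels each point, and reshape restores the
# labels to the grid layout of the mesh.
Z = predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)
print(Z)  # [[0 0] [0 1] [1 1]]
```

After the reshape, Z[i, j] is the predicted class of the mesh point (xx[i, j], yy[i, j]), which is the form contourf expects.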
Now, the "tricky" part which may lead to misunderstanding is
pl.contourf(xx, yy, Z, cmap=pl.cm.Paired)
This function plots the contours of your f function. To visualize a 3-dimensional function on the plane, one often creates contour plots; it is like a height map of your function. A line is drawn between points where a large change in the value of f is detected around them.
A nice example from MathWorld shows what such a plot looks like.
In the case of SVM we have only two possible values - 0 and 1 - so the contour lines are located exactly in those parts of your 2d space where on one side f(x)=0 and on the other f(x)=1. So even though it seems like a "2d plot", it is not - the shape you can observe (the decision boundary) is a visualization of the biggest differences in the 3d function.
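The "boundary sits where the labels jump" idea can be checked on a single row of a mesh, using a hypothetical classifier f(x) = [x > 0] (invented for illustration):

```python
import numpy as np

# Labels along one row of a mesh, from the stand-in rule f(x) = [x > 0]:
xs = np.arange(-2.0, 2.0, 1.0)   # mesh points -2, -1, 0, 1
Z = (xs > 0).astype(int)         # labels       0,  0, 0, 1

# contourf draws the boundary between neighbours whose labels differ;
# here the only jump is between the points x = 0 and x = 1.
jumps = np.nonzero(np.diff(Z))[0]
print(jumps)  # [2]
```

contourf does this jump detection in both grid directions at once, which is why the drawn curve traces the full 2d decision boundary.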
The sklearn documentation visualizes this for a multi-classification example, where f : R^2 -> {0,1,2}; the idea is exactly the same, but a contour is plotted between adjacent points x1 and x2 for which f(x1) != f(x2).
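A minimal sketch of the three-class case, with a made-up rule standing in for a trained sklearn classifier:

```python
import numpy as np

# Hypothetical 3-class rule f : R^2 -> {0, 1, 2}: the class depends on
# which side of the lines x + y = 0 and x + y = 1 the point falls.
def predict(points):
    return np.digitize(points[:, 0] + points[:, 1], bins=[0.0, 1.0])

xx, yy = np.meshgrid(np.arange(-1.0, 2.0, 1.0), np.arange(-1.0, 2.0, 1.0))
Z = predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Contours are drawn wherever adjacent labels differ, exactly as in the
# binary case - the idea carries over unchanged to three classes.
print(Z)  # [[0 0 1] [0 1 2] [1 2 2]]
```

Here two separate boundaries appear (between classes 0/1 and 1/2), each traced where neighbouring mesh points get different labels.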