How to use a custom SVM kernel?


Question

I'd like to implement my own Gaussian kernel in Python, just as an exercise. I'm using sklearn.svm.SVC(kernel=my_kernel), but I really don't understand what is going on.

I expected the function my_kernel to be called with the columns of the X matrix as parameters; instead, it is called with X, X as arguments. The examples don't make this any clearer.

What am I missing?

Here is my code:

'''
Created on 15 Nov 2014

@author: Luigi
'''
import scipy.io
import numpy as np
from sklearn import svm
import matplotlib.pyplot as plt

def svm_class(fileName):

    data = scipy.io.loadmat(fileName)
    X = data['X']
    y = data['y']

    f = svm.SVC(kernel = 'rbf', gamma=50, C=1.0)
    f.fit(X,y.flatten())
    plotData(np.hstack((X,y)), X, f)

    return

def plotData(arr, X, f):

    ax = plt.subplot(111)

    ax.scatter(arr[arr[:,2]==0][:,0], arr[arr[:,2]==0][:,1], c='r', marker='o', label='Zero')
    ax.scatter(arr[arr[:,2]==1][:,0], arr[arr[:,2]==1][:,1], c='g', marker='+', label='One')

    h = .02  # step size in the mesh
    # create a mesh to plot in
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))


    # Plot the decision boundary. For that, we will assign a color to each
    # point in the mesh [x_min, x_max]x[y_min, y_max].
    Z = f.predict(np.c_[xx.ravel(), yy.ravel()])

    # Put the result into a color plot
    Z = Z.reshape(xx.shape)
    plt.contour(xx, yy, Z)



    plt.xlim(np.min(arr[:,0]), np.max(arr[:,0]))
    plt.ylim(np.min(arr[:,1]), np.max(arr[:,1]))
    plt.show()
    return


def gaussian_kernel(x1,x2):
    sigma = 0.5
    return np.exp(-np.sum((x1-x2)**2)/(2*sigma**2))

if __name__ == '__main__':

    fileName = 'ex6data2.mat'
    svm_class(fileName)

Recommended answer

After reading the answer above, and some other questions and sites (1, 2, 3, 4, 5), I put this together for a Gaussian kernel in svm.SVC().

First, call svm.SVC() with kernel="precomputed".

Then compute a Gram matrix, a.k.a. kernel matrix (often abbreviated as K).

Then use this Gram matrix as the first argument (i.e. X) to svm.SVC().fit():

I start from the following code:

C=0.1
model = svmTrain(X, y, C, "gaussian")

which calls sklearn.svm.SVC() in svmTrain(), and then sklearn.svm.SVC().fit():

from sklearn import svm

def svmTrain(X, y, C, kernelFunction):
    # Only the Gaussian branch is shown here
    if kernelFunction == "gaussian":
        clf = svm.SVC(C=C, kernel="precomputed")
        return clf.fit(gaussianKernelGramMatrix(X, X), y)

The Gram matrix computation, used as a parameter to sklearn.svm.SVC().fit(), is done in gaussianKernelGramMatrix():

import numpy as np

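# Note: the default K_function=gaussianKernel is evaluated when this def runs,
# so gaussianKernel (shown further down) must be defined first.
def gaussianKernelGramMatrix(X1, X2, K_function=gaussianKernel):
    """(Pre)calculates Gram Matrix K"""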

    gram_matrix = np.zeros((X1.shape[0], X2.shape[0]))
    for i, x1 in enumerate(X1):
        for j, x2 in enumerate(X2):
            gram_matrix[i, j] = K_function(x1, x2)
    return gram_matrix
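
The double loop above calls the kernel once per pair of rows, which can get slow as the data grows. A minimal vectorized sketch of the same Gaussian Gram matrix follows; the function name gaussian_gram_matrix_vectorized is made up for illustration and is not part of the original answer.

```python
import numpy as np

def gaussian_gram_matrix_vectorized(X1, X2, sigma=0.1):
    """Gaussian Gram matrix of shape (len(X1), len(X2)), without Python loops.

    Hypothetical helper, not from the original answer.
    """
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, computed for all pairs at once
    sq_dists = (
        np.sum(X1 ** 2, axis=1)[:, None]
        + np.sum(X2 ** 2, axis=1)[None, :]
        - 2.0 * X1 @ X2.T
    )
    # Rounding can produce tiny negative values; clip them to zero
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))
```

It takes the same (X1, X2) arguments as the loop version, so it should be usable wherever that Gram matrix is passed to fit() or predict().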

which uses gaussianKernel() to get a radial basis function kernel between x1 and x2 (a measure of similarity based on a Gaussian distribution centered at x1 with sigma=0.1):

def gaussianKernel(x1, x2, sigma=0.1):

    # Flatten x1 and x2 to 1-D arrays so they subtract element-wise
    x1 = x1.flatten()
    x2 = x2.flatten()

    sim = np.exp(- np.sum( np.power((x1 - x2),2) ) / float( 2*(sigma**2) ) )

    return sim
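
This kernel has the same form as scikit-learn's built-in RBF kernel, exp(-gamma * ||x1 - x2||^2), with gamma = 1 / (2 * sigma**2). So sigma=0.1 corresponds to gamma of about 50, the value used with kernel='rbf' in the question's code. A quick sanity check, assuming sklearn.metrics.pairwise.rbf_kernel is available and using made-up random data:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def gaussian_kernel(x1, x2, sigma=0.1):
    """Same formula as gaussianKernel() above, restated to keep this self-contained."""
    return np.exp(-np.sum((x1 - x2) ** 2) / (2 * sigma ** 2))

sigma = 0.1
gamma = 1.0 / (2 * sigma ** 2)  # approximately 50 for sigma = 0.1

# Synthetic points, scaled down so the kernel values are not all near zero
rng = np.random.default_rng(0)
A = rng.random((5, 2)) * 0.3
B = rng.random((3, 2)) * 0.3

# Pairwise custom kernel vs. scikit-learn's RBF kernel with the matching gamma
K_custom = np.array([[gaussian_kernel(a, b, sigma) for b in B] for a in A])
K_builtin = rbf_kernel(A, B, gamma=gamma)
same = np.allclose(K_custom, K_builtin)
```

If the two matrices match, the precomputed approach and SVC(kernel='rbf', gamma=gamma) are working from the same kernel values.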

Then, once the model is trained with this custom kernel, we predict with "the [custom] kernel between the test data and the training data":

predictions = model.predict( gaussianKernelGramMatrix(Xval, X) )
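
The argument order matters here: the Gram matrix for fitting is square (training x training), while the one for prediction has one row per test sample and one column per training sample. A small sketch on synthetic data that makes the shapes explicit; the helper gram() and the random data are assumptions for illustration only.

```python
import numpy as np
from sklearn import svm

def gram(X1, X2, sigma=0.1):
    # Hypothetical helper: pairwise squared distances via broadcasting,
    # then the Gaussian kernel
    d2 = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=2)
    return np.exp(-d2 / (2 * sigma ** 2))

# Made-up data: 40 training points and 10 test points in 2-D
rng = np.random.default_rng(0)
X_train = rng.random((40, 2))
y_train = (X_train[:, 0] > 0.5).astype(int)
X_test = rng.random((10, 2))

clf = svm.SVC(C=1.0, kernel="precomputed")
K_train = gram(X_train, X_train)  # shape (40, 40)
clf.fit(K_train, y_train)

K_test = gram(X_test, X_train)    # shape (10, 40): rows = test, columns = training
pred = clf.predict(K_test)        # one label per test row
```

Swapping the arguments to gram(X_train, X_test) would give a (40, 10) matrix, whose column count no longer matches the number of training samples.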

In short, to use a custom SVM Gaussian kernel, you can use this snippet:

import numpy as np
from sklearn import svm

def gaussianKernelGramMatrixFull(X1, X2, sigma=0.1):
    """(Pre)calculates Gram Matrix K"""

    gram_matrix = np.zeros((X1.shape[0], X2.shape[0]))
    for i, x1 in enumerate(X1):
        for j, x2 in enumerate(X2):
            x1 = x1.flatten()
            x2 = x2.flatten()
            gram_matrix[i, j] = np.exp(- np.sum( np.power((x1 - x2),2) ) / float( 2*(sigma**2) ) )
    return gram_matrix

X=...
y=...
Xval=...

C=0.1
clf = svm.SVC(C = C, kernel="precomputed")
model = clf.fit( gaussianKernelGramMatrixFull(X,X), y )

p = model.predict( gaussianKernelGramMatrixFull(Xval, X) )
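
On the original question: as far as I understand scikit-learn's documented behavior, when kernel is a callable, SVC passes it two data matrices and expects the whole Gram matrix back, which would explain why my_kernel was called with X, X rather than with individual vectors. A minimal sketch under that assumption, on made-up data, comparing the callable route with the precomputed route:

```python
import numpy as np
from sklearn import svm

def my_kernel(X1, X2, sigma=0.1):
    """Receives two 2-D arrays; returns the (len(X1), len(X2)) Gram matrix."""
    d2 = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=2)
    return np.exp(-d2 / (2 * sigma ** 2))

# Synthetic stand-ins for X, y and Xval
rng = np.random.default_rng(0)
X = rng.random((40, 2))
y = (X[:, 0] > 0.5).astype(int)
Xval = rng.random((10, 2))

# Route 1: let SVC call the kernel function itself
clf_callable = svm.SVC(C=0.1, kernel=my_kernel).fit(X, y)
p_callable = clf_callable.predict(Xval)

# Route 2: precompute the Gram matrices, as in the snippet above
clf_precomputed = svm.SVC(C=0.1, kernel="precomputed").fit(my_kernel(X, X), y)
p_precomputed = clf_precomputed.predict(my_kernel(Xval, X))

agree = np.array_equal(p_callable, p_precomputed)
```

With the callable route, predict() takes the raw feature matrix; with kernel="precomputed", you pass the test-versus-training Gram matrix yourself.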
