Difference between a linear problem and a non-linear problem? Essence of Dot-Product and Kernel trick


Problem Description

The kernel trick maps a non-linear problem into a linear problem.

我的问题是:
是1.什么线性和非线性问题的主要区别?这是这两个类问题的差异后面的直觉?而如何核技巧可以帮助使用线性分类的非线性问题?
2.为什么是积在两种情况下如此重要?

My questions are:
1. What is the main difference between a linear and a non-linear problem? What is the intuition behind the difference between these two classes of problems? And how does the kernel trick help us use linear classifiers on a non-linear problem?
2. Why is the dot product so important in the two cases?

Thanks.

Recommended Answer

Many classifiers, among them the linear Support Vector Machine (SVM), can only solve problems that are linearly separable, i.e. where the points belonging to class 1 can be separated from the points belonging to class 2 by a hyperplane.
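
For concreteness, here is a minimal sketch (assuming NumPy and scikit-learn; the toy data is made up for illustration) of a linear SVM finding a separating hyperplane for two point clouds:

    import numpy as np
    from sklearn.svm import SVC

    X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0],   # class 0
                  [4.0, 4.0], [5.0, 4.5], [4.5, 5.0]])  # class 1
    y = np.array([0, 0, 0, 1, 1, 1])

    clf = SVC(kernel="linear")
    clf.fit(X, y)

    # coef_ and intercept_ describe the separating hyperplane w.x + b = 0
    print(clf.coef_, clf.intercept_)
    print(clf.predict([[1.2, 1.3], [4.8, 4.7]]))  # expected: [0 1]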

In many cases, a problem that is not linearly separable can be solved by applying a transform phi() to the data points; this transform is said to transform the points to feature space. The hope is that, in feature space, the points will be linearly separable. (Note: This is not the kernel trick yet... stay tuned.)
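
As an illustrative sketch (assuming NumPy; both the toy data and this particular phi() are made up for illustration): points inside vs. outside a circle are not linearly separable in 2-D, but a simple transform makes them separable by a plane in 3-D:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(200, 2))
    y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 0.5).astype(int)  # 1 = outside the circle

    def phi(X):
        """Map each 2-D point to the 3-D feature space (x1, x2, x1^2 + x2^2)."""
        return np.column_stack([X[:, 0], X[:, 1], X[:, 0] ** 2 + X[:, 1] ** 2])

    # In feature space the plane z = 0.5 separates the two classes exactly.
    Z = phi(X)
    print(np.all((Z[:, 2] > 0.5) == (y == 1)))  # -> True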

It can be shown that, the higher the dimension of the feature space, the greater the number of problems that are linearly separable in that space. Therefore, one would ideally want the feature space to be as high-dimensional as possible.

Unfortunately, as the dimension of feature space increases, so does the amount of computation required. This is where the kernel trick comes in. Many machine learning algorithms (among them the SVM) can be formulated in such a way that the only operation they perform on the data points is a scalar product between two data points. (I will denote a scalar product between x1 and x2 by <x1, x2>.)
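
As a small illustration (assuming NumPy), such an algorithm never needs the raw coordinates once it has the Gram matrix of pairwise scalar products:

    import numpy as np

    X = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])

    # gram[i, j] = <x_i, x_j>; a "kernelized" algorithm only ever needs this
    # matrix, never the raw coordinates themselves.
    gram = X @ X.T
    print(gram[0, 1])  # -> 11.0  (= 1*3 + 2*4)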

If we transform our points to feature space, the scalar product now looks like this:

<phi(x1), phi(x2)>

The key insight is that there exists a class of functions called kernels that can be used to optimize the computation of this scalar product. A kernel is a function K(x1, x2) that has the property that

K(x1, x2) = <phi(x1), phi(x2)>

for some function phi(). In other words: We can evaluate the scalar product in the low-dimensional data space (where x1 and x2 "live") without having to transform to the high-dimensional feature space (where phi(x1) and phi(x2) "live") -- but we still get the benefits of transforming to the high-dimensional feature space. This is called the kernel trick.
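
One standard concrete example (a sketch assuming NumPy; the homogeneous degree-2 polynomial kernel is just one choice of K) checks the defining property numerically: for K(x, z) = (<x, z>)^2 in 2-D, the corresponding feature map is phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2), and both routes give the same number:

    import numpy as np

    def K(x, z):
        return np.dot(x, z) ** 2                 # evaluated in the 2-D data space

    def phi(x):
        return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

    x1 = np.array([1.0, 2.0])
    x2 = np.array([3.0, 4.0])

    print(K(x1, x2))                             # -> 121.0
    print(np.dot(phi(x1), phi(x2)))              # -> 121.0 (same value, via feature space)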

Many popular kernels, such as the Gaussian kernel, actually correspond to a transform phi() that transforms into an infinite-dimensional feature space. The kernel trick allows us to compute scalar products in this space without having to represent points in this space explicitly (which, obviously, is impossible on computers with finite amounts of memory).
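
As a usage sketch (assuming scikit-learn; the gamma value is chosen arbitrarily), an SVM with the Gaussian (RBF) kernel handles the circular toy problem from earlier without ever representing the infinite-dimensional feature vectors:

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(200, 2))
    y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 0.5).astype(int)

    clf = SVC(kernel="rbf", gamma=1.0)   # K(x, z) = exp(-gamma * ||x - z||^2)
    clf.fit(X, y)
    print(clf.score(X, y))               # typically close to 1.0 on this toy set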
