Why can't LinearSVC do this simple classification?

Question

I'm trying to do the following simple classification using the LinearSVC object in scikit-learn. I've tried both version 0.10 and version 0.14, using this code:

import numpy as np
from sklearn.svm import LinearSVC, SVC

# Four samples, two features each; the two classes are far apart.
data = np.array([[1007., 1076.],
                 [1017., 1009.],
                 [2021., 2029.],
                 [2060., 2085.]])
groups = np.array([1, 1, 2, 2])

svc = LinearSVC()
svc.fit(data, groups)
svc.predict(data)

I get the output:

array([2, 2, 2, 2])

However, if I replace the classifier with

svc = SVC(kernel='linear')

then I get the result

array([ 1.,  1.,  2.,  2.])

which is correct. Does anyone know why LinearSVC would botch this simple problem?

Recommended answer

The algorithm underlying LinearSVC is very sensitive to extreme values in its input. With verbose output enabled, the solver reports that it hit the iteration limit without converging:

>>> svc = LinearSVC(verbose=1)
>>> svc.fit(data, groups)
[LibLinear]....................................................................................................
optimization finished, #iter = 1000

WARNING: reaching max number of iterations
Using -s 2 may be faster (also see FAQ)

Objective value = -0.001256
nSV = 4
LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
     intercept_scaling=1, loss='l2', multi_class='ovr', penalty='l2',
     random_state=None, tol=0.0001, verbose=1)

(The warning refers to the LibLinear FAQ, since scikit-learn's LinearSVC is based on that library.)
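To make "extreme values" concrete, here is a minimal NumPy-only sketch that prints the per-column mean and standard deviation of the data from the question and standardizes it by hand. The variable names `col_mean`, `col_std`, and `standardized` are illustrative and not from the original thread; the computation assumes the usual zero-mean, unit-variance standardization with the population standard deviation.

```python
import numpy as np

# Data from the question: every feature value is in the thousands.
data = np.array([[1007., 1076.],
                 [1017., 1009.],
                 [2021., 2029.],
                 [2060., 2085.]])

# Per-column statistics (np.std uses the population formula, ddof=0, by default).
col_mean = data.mean(axis=0)
col_std = data.std(axis=0)

# Standardize each column to zero mean and unit variance.
standardized = (data - col_mean) / col_std

print("column means:", col_mean)
print("column stds: ", col_std)
print("standardized data:")
print(standardized)
```

After this transformation every value is of order 1, which is a far friendlier range for the solver than raw values above 1000.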

You should normalize the data first:

>>> from sklearn.preprocessing import scale
>>> data = scale(data)
>>> svc.fit(data, groups)
[LibLinear]...
optimization finished, #iter = 39
Objective value = -0.240988
nSV = 4
LinearSVC(C=1.0, class_weight=None, dual=True, fit_intercept=True,
     intercept_scaling=1, loss='l2', multi_class='ovr', penalty='l2',
     random_state=None, tol=0.0001, verbose=1)
>>> svc.predict(data)
array([1, 1, 2, 2])
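One caveat with calling `scale` directly is that any new samples must be transformed with the same mean and standard deviation before prediction. A possible way to handle that, sketched here as a suggestion rather than as part of the original answer, is to chain `StandardScaler` and `LinearSVC` in a pipeline. The names `model` and `predictions` and the `random_state=0` setting are assumptions made for this example, and it requires a scikit-learn version that ships `sklearn.pipeline.make_pipeline`.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

data = np.array([[1007., 1076.],
                 [1017., 1009.],
                 [2021., 2029.],
                 [2060., 2085.]])
groups = np.array([1, 1, 2, 2])

# The scaler learns the column means and standard deviations during fit()
# and reapplies them automatically inside predict().
model = make_pipeline(StandardScaler(), LinearSVC(random_state=0))
model.fit(data, groups)

# Predict on the raw, unscaled inputs; the pipeline scales them internally.
predictions = model.predict(data)
print(predictions)
```

With the scaling folded into the model, the raw data can be passed to both `fit` and `predict`, and the result should match the labels in `groups`.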
