Kernlab kraziness: inconsistent results for identical problems


Question

I've found some puzzling behavior in the kernlab package: estimating SVMs which are mathematically identical produces different results in software.

This code snippet just takes the iris data and makes it a binary classification problem for the sake of simplicity. As you can see, I'm using linear kernels in both SVMs.

library(kernlab)
library(e1071)

data(iris)
x <- as.matrix(iris[, 1:4])
y <- as.factor(ifelse(iris[, 5] == 'versicolor', 1, -1))  # versicolor vs. the rest
C <- 5.278031643091578

# kernlab with the built-in linear kernel
svm1 <- ksvm(x = x, y = y, scaled = FALSE, kernel = 'vanilladot', C = C)

# kernlab with the same linear kernel supplied as a precomputed kernel matrix
K <- kernelMatrix(vanilladot(), x)
svm2 <- ksvm(x = K, y = y, C = C, kernel = 'matrix')

# e1071 (libsvm) with a linear kernel, for comparison
svm3 <- svm(x = x, y = y, scale = FALSE, kernel = 'linear', cost = C)

However, the summary information of svm1 and svm2 are dramatically different: kernlab reports completely different support vector counts, training error rates, and objective function values between the two models.

> svm1
Support Vector Machine object of class "ksvm" 

SV type: C-svc  (classification) 
 parameter : cost C = 5.27803164309158 

Linear (vanilla) kernel function. 

Number of Support Vectors : 89 

Objective Function Value : -445.7911 
Training error : 0.26 
> svm2
Support Vector Machine object of class "ksvm" 

SV type: C-svc  (classification) 
 parameter : cost C = 5.27803164309158 

[1] " Kernel matrix used as input."

Number of Support Vectors : 59 

Objective Function Value : -292.692 
Training error : 0.333333

For the sake of comparison, I also computed the same model using e1071, which provides an R interface for the libsvm package.

> svm3

Call:
svm.default(x = x, y = y, scale = FALSE, kernel = "linear", cost = C)


Parameters:
   SVM-Type:  C-classification 
 SVM-Kernel:  linear 
       cost:  5.278032 
      gamma:  0.25 

Number of Support Vectors:  89

It reports 89 support vectors, the same as svm1.
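
For a quick side-by-side look, the headline numbers can also be pulled out of the three fitted objects directly. The sketch below is illustrative rather than part of the original post; it assumes kernlab's accessor functions nSV() and error() and e1071's tot.nSV component.

# Illustrative sketch: summarize the three fits in one table.
# Run after the fitting code above.
data.frame(
  model     = c('svm1: kernlab + vanilladot',
                'svm2: kernlab + kernel matrix',
                'svm3: e1071 / libsvm'),
  n_sv      = c(nSV(svm1), nSV(svm2), svm3$tot.nSV),
  train_err = c(error(svm1), error(svm2), mean(predict(svm3, x) != y))
)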

My question is whether there are any known bugs in the kernlab package which can account for this unusual behavior.

(Kernlab for R is an SVM solver that allows one to use one of several pre-packaged kernel functions, or a user-supplied kernel matrix. The output is an estimate of a support vector machine for the user-supplied hyperparameters.)

Answer

Reviewing some of the code, it appears that this is the offending line:

https://github.com/cran/kernlab/blob/efd7d91521b439a993efb49cf8e71b57fae5fc5a/src/svm.cpp#L4205

That is, in the case of a user-supplied kernel matrix, ksvm only looks at two dimensions, rather than whatever the dimensionality of the input is. This seems strange, and is probably a hold-over from some testing or other. Tests of the linear kernel with data of just two dimensions produce the same result: replace 1:4 with 1:2 in the above, and the output and predictions all agree.
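
As a concrete check of the two-dimensional case described above, the comparison can be rerun on just the first two feature columns; per the answer, the two kernlab fits then agree. The following is an illustrative sketch, not part of the original answer.

# Illustrative check: redo the comparison with only two feature columns.
# C and y are as defined in the question's code above.
x2 <- as.matrix(iris[, 1:2])

svm1_2d <- ksvm(x = x2, y = y, scaled = FALSE, kernel = 'vanilladot', C = C)

K2      <- kernelMatrix(vanilladot(), x2)
svm2_2d <- ksvm(x = K2, y = y, C = C, kernel = 'matrix')

nSV(svm1_2d); nSV(svm2_2d)   # these should now match, as should the predictions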
