Weird SVM prediction performance in scikit-learn (SVMLIB)

Question

I am using SVC from scikit-learn on a large dataset of 10000x1000 (10000 objects with 1000 features). I have already seen in other sources that SVMLIB doesn't scale well beyond ~10000 objects, and I do indeed observe this:

training time for 10000 objects: 18.9s
training time for 12000 objects: 44.2s
training time for 14000 objects: 92.7s

You can imagine what happens when I try 80000. However, what I found very surprising is that the SVM's predict() takes even more time than the training fit():

prediction time for 10000 objects (model was also trained on those objects): 49.0s
prediction time for 12000 objects (model was also trained on those objects): 91.5s
prediction time for 14000 objects (model was also trained on those objects): 141.84s

It is trivial to get prediction to run in linear time (although it might be close to linear here), and it is usually much faster than training. So what is going on here?
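
The question does not include the timing code. As a point of reference, here is a minimal sketch of the setup described above, assuming synthetic random data and an SVC with default parameters; none of this is the asker's actual script:

import time

import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)

for n in (10000, 12000, 14000):
    # synthetic stand-in for the 1000-feature dataset described in the question
    X = rng.rand(n, 1000)
    y = rng.randint(0, 2, size=n)

    clf = SVC()  # default parameters are an assumption here

    t0 = time.perf_counter()
    clf.fit(X, y)
    fit_time = time.perf_counter() - t0

    t0 = time.perf_counter()
    clf.predict(X)  # predict on the same objects the model was trained on
    predict_time = time.perf_counter() - t0

    print("n=%d: fit %.1fs, predict %.1fs" % (n, fit_time, predict_time))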

Answer

Are you sure you are not including the training time in your measurement of the prediction time? Do you have a code snippet for your timings?
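
As an illustration of the pitfall the answer is pointing at, here is a hedged sketch (the helper names below are made up for this example): if the timer is started before fit(), the reported prediction time silently includes training. Only the second helper isolates the cost of predict() itself.

import time

from sklearn.svm import SVC

def timed_predict_wrong(X, y):
    # the timer wraps both calls, so training is counted as "prediction" time
    clf = SVC()
    t0 = time.perf_counter()
    clf.fit(X, y)
    y_pred = clf.predict(X)
    return y_pred, time.perf_counter() - t0

def timed_predict_right(X, y):
    # train first, then time only the call to predict()
    clf = SVC().fit(X, y)
    t0 = time.perf_counter()
    y_pred = clf.predict(X)
    return y_pred, time.perf_counter() - t0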
