How long does it take to train an SVM classifier?


Question

I wrote the following code and tested it on a small data set:

from sklearn import svm
from sklearn.multiclass import OneVsRestClassifier
classif = OneVsRestClassifier(svm.SVC(kernel='rbf'))
classif.fit(X, y)

Here X and y are NumPy arrays (X is a 30000x784 matrix, y is 30000x1). On small data the algorithm works well and gives correct results.

But I started my program about 10 hours ago, and it is still running.

I want to know how long it will take, or whether it has gotten stuck somehow. (Laptop specs: 4 GB memory, Core i5-480M.)

Recommended answer

SVM training can take arbitrarily long; it depends on many parameters:

  • C parameter - the greater the misclassification penalty, the slower the process
  • kernel - the more complicated the kernel, the slower the process (rbf is the most complex of the predefined ones)
  • data size/dimensionality - the same rule again: the more samples and features, the slower the process
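To get a feel for how these factors interact on a given machine, one could time `fit` while varying one factor at a time. The following is only a rough sketch: the data is a small synthetic stand-in (an assumption, not the 30000x784 set from the question), and `fit_seconds` is a hypothetical helper name.

```python
import time

import numpy as np
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption; not the asker's 30000x784 set).
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 20))
y = (X[:, 0] + X[:, 1] > 0).astype(int)


def fit_seconds(kernel, C, n):
    """Return wall-clock seconds needed to fit an SVC on the first n rows."""
    clf = SVC(kernel=kernel, C=C)
    start = time.perf_counter()
    clf.fit(X[:n], y[:n])
    return time.perf_counter() - start


# Vary kernel, C, and sample count over a small grid.
timings = {
    (kernel, C, n): fit_seconds(kernel, C, n)
    for kernel in ("linear", "rbf")
    for C in (0.1, 10.0)
    for n in (150, 300, 600)
}

for key, seconds in sorted(timings.items()):
    print(key, f"{seconds:.4f} s")
```

Timings at this scale are noisy, so the trend across sizes matters more than any single number.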

In general, the basic SMO algorithm is O(n^3), so with 30 000 data points it has to perform a number of operations proportional to 27 000 000 000 000 (30 000^3), which is a really huge number. What are your options?

  • change the kernel to a linear one; 784 features is quite a lot, so rbf may be redundant
  • reduce the features' dimensionality (PCA?)
  • lower the C parameter
  • train the model on a subset of your data to find good parameters, and then train on the whole set on some cluster/supercomputer
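Several of these options can be combined and timed on a subset before committing to the full run. The sketch below is one possible arrangement, not the only one. It assumes synthetic stand-in data with the question's 784 features, an arbitrary choice of 50 PCA components and C=0.1, and it swaps in scikit-learn's `LinearSVC` as the linear model. `extrapolate` is a hypothetical helper that scales the measured time under an assumed growth exponent.

```python
import time

import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Synthetic stand-in with the question's feature count (assumption:
# the real X and y are not shown in the post). Labels depend on the
# data so that the classifier has something to learn.
rng = np.random.default_rng(0)
n_subset, n_full, n_features = 1000, 30000, 784
X_subset = rng.normal(size=(n_subset, n_features))
y_subset = np.argmax(X_subset[:, :10], axis=1)  # 10 classes

# PCA reduces 784 features to 50 (an arbitrary illustrative choice),
# then a linear SVM with a small C is trained on the subset.
model = make_pipeline(PCA(n_components=50), LinearSVC(C=0.1))

start = time.perf_counter()
model.fit(X_subset, y_subset)
subset_seconds = time.perf_counter() - start


def extrapolate(seconds, n_small, n_large, exponent):
    """Scale a measured time by (n_large / n_small) ** exponent."""
    return seconds * (n_large / n_small) ** exponent


# Exponent 3 mirrors the cubic estimate quoted for basic SMO; for a
# linear solver it is likely pessimistic, so both ends are printed.
cubic = extrapolate(subset_seconds, n_subset, n_full, 3)
linear = extrapolate(subset_seconds, n_subset, n_full, 1)
print(f"subset fit: {subset_seconds:.2f} s")
print(f"cubic extrapolation: {cubic:.0f} s")
print(f"linear extrapolation: {linear:.0f} s")
```

The extrapolation is only a back-of-the-envelope guide: real scaling depends on the solver, the data, and the chosen parameters.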
