Sklearn SVM: SVR and SVC, getting the same prediction for every input

Question

Here's a paste of the code: SVM Example Code

I checked out a couple of the other answers to this problem...and it seems like this specific iteration of the problem is a bit different.

First off, my inputs are normalized, and I have five inputs per point. The values are all reasonably sized (healthy 0.5s and 0.7s etc--few near zero or near 1 numbers).

I have about 70 x inputs corresponding to their 70 y inputs. The y inputs are also normalized (they are percentage changes of my function after each time-step).

I initialize my SVR (and SVC), train them, and then test them with 30 out-of-sample inputs...and get the exact same prediction for every input (and the inputs are changing by reasonable amounts--0.3, 0.6, 0.5, etc.). I would think that the classifier (at least) would have some differentiation...

Here's the code I've got:

from sklearn import svm
import pandas as pd

# train svr
my_svr = svm.SVR()
my_svr.fit(x_training, y_trainr)

# train svc
my_svc = svm.SVC()
my_svc.fit(x_training, y_trainc)

# predict regression
p_regression = my_svr.predict(x_test)
p_r_series = pd.Series(index=y_testing.index, data=p_regression)

# predict classification
p_classification = my_svc.predict(x_test)
p_c_series = pd.Series(index=y_testing_classification.index, data=p_classification)

Here's a sample of my inputs:

x_training = [[  1.52068627e-04   8.66880301e-01   5.08504362e-01   9.48082047e-01   7.01156322e-01],
              [  6.68130520e-01   9.07506250e-01   5.07182647e-01   8.11290634e-01   6.67756208e-01],
              ... x 70 ]

y_trainr = [-0.00723209 -0.01788079  0.00741741 -0.00200805 -0.00737761  0.00202704 ...]

y_trainc = [ 0.  0.  1.  0.  0.  1.  1.  0. ...]

And the x_test matrix (5x30) is similar to the x_training matrix in terms of magnitudes and variance of inputs...same for y_testr and y_testc.

Currently, the predictions for all of the tests are exactly the same (0.00596 for the regression, and 1 for the classification...)

How do I get the SVR and SVC functions to spit out relevant predictions? Or at least different predictions based on the inputs...

At the very least, the classifier should be able to make choices. I mean, even if I haven't provided enough dimensions for regression...

Recommended answer

Try increasing C from its default of 1.0. It looks like the model is underfitting.

my_svc = svm.SVC(probability=True, C=1000)
my_svc.fit(x_training,y_trainc)

p_classification = my_svc.predict(x_test)

p_classification then becomes:

array([ 1.,  0.,  1.,  0.,  1.,  1.,  1.,  1.,  1.,  1.,  0.,  0.,  0.,
        1.,  0.,  0.,  0.,  0.,  0.,  1.,  1.,  0.,  1.,  1.,  1.,  1.,
        1.,  1.,  1.,  1.])
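
If you want to check how C affects your own data before settling on a value, a quick sweep like the one below can help. This is only a sketch: the data is synthetic (random features with weakly informative labels, invented here to stand in for your 70x5 training set), and the C values are arbitrary. It counts how many distinct classes the classifier predicts at each setting; a count of 1 means you are still getting the same prediction for every input.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-in for the question's data (an assumption, not the real inputs).
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 1.0, size=(70, 5))
# Labels depend only weakly on the first feature, plus noise.
y_train = (X_train[:, 0] + 0.5 * rng.normal(size=70) > 0.5).astype(int)
X_test = rng.uniform(0.0, 1.0, size=(30, 5))

distinct_by_C = {}
for C in (1, 10, 100, 1000):
    clf = SVC(C=C).fit(X_train, y_train)
    pred = clf.predict(X_test)
    # One distinct value means the classifier collapsed to a single class.
    distinct_by_C[C] = np.unique(pred).size
    print(f"C={C}: {distinct_by_C[C]} distinct predicted class(es)")
```

Swap in your own x_training, y_trainc and x_test to see where the predictions start to differ.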

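For the SVR case you will also want to reduce your epsilon. The default is 0.1, which is much larger than targets of the size you show (around 0.01), so every training point sits inside the epsilon-tube and the fitted function ends up constant.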

my_svr = svm.SVR(C=1000, epsilon=0.0001)
my_svr.fit(x_training,y_trainr)

p_regression = my_svr.predict(x_test)

p_regression then becomes:

array([-0.00430622,  0.00022762,  0.00595002, -0.02037147, -0.0003767 ,
        0.00212401,  0.00018503, -0.00245148, -0.00109994, -0.00728342,
       -0.00603862, -0.00321413, -0.00922082, -0.00129351,  0.00086844,
        0.00380351, -0.0209799 ,  0.00495681,  0.0070937 ,  0.00525708,
       -0.00777854,  0.00346639,  0.0070703 , -0.00082952,  0.00246366,
        0.03007465,  0.01172834,  0.0135077 ,  0.00883518,  0.00399232])
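
The effect of epsilon is easy to reproduce on made-up data. The sketch below is an illustration under assumptions, not your actual inputs: it builds 70 random 5-feature points with targets of magnitude at most 0.005, then compares SVR with its defaults against the settings used above. The spread (max minus min) of the test predictions shows whether the model distinguishes between inputs at all.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic stand-in for the question's data (an assumption, not the real inputs).
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 1.0, size=(70, 5))
y_train = 0.01 * (X_train[:, 0] - 0.5)  # tiny targets, |y| <= 0.005
X_test = rng.uniform(0.0, 1.0, size=(30, 5))

# Default epsilon is 0.1: every target lies inside the tube, so nothing is penalized.
default_pred = SVR().fit(X_train, y_train).predict(X_test)

# Smaller epsilon and larger C, as suggested above.
tuned_pred = SVR(C=1000, epsilon=0.0001).fit(X_train, y_train).predict(X_test)

default_spread = np.ptp(default_pred)  # max - min of the predictions
tuned_spread = np.ptp(tuned_pred)
print("spread with defaults:", default_spread)
print("spread with C=1000, epsilon=0.0001:", tuned_spread)
```

An alternative that leaves epsilon alone is to rescale the targets (for example, standardize y before fitting and invert the scaling on the predictions), since epsilon is expressed in the same units as y.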

You should look to tune your C parameter using cross validation so that it is able to perform best on whichever metric matters most to you. You may want to look at GridSearchCV to help you do this.
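
A minimal GridSearchCV sketch might look like the following. The data here is synthetic, and the parameter grid, the 5-fold setting and the accuracy metric are all placeholder assumptions; pick values and a scoring metric that suit your problem.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for the question's data (an assumption, not the real inputs).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(70, 5))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

# Candidate values are illustrative only.
param_grid = {"C": [0.1, 1, 10, 100, 1000], "gamma": ["scale", 0.1, 1.0]}

# 5-fold cross-validation, scored on accuracy.
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", search.best_score_)
```

Since your targets are changes per time step, a time-ordered splitter such as TimeSeriesSplit (passed via the cv argument) may give a more honest estimate than the default folds. The same pattern works for SVR by adding epsilon to the grid and choosing a regression metric.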
