What is a good range of values for the svm.SVC() hyperparameters to be explored via GridSearchCV()?


Problem description

I am running into the problem that the hyperparameter ranges for my svm.SVC() are so wide that GridSearchCV() never completes! One idea is to use RandomizedSearchCV() instead. But again, my dataset is relatively big, so 500 iterations take about an hour!

My question is: what is a good set-up (in terms of the range of values for each hyperparameter) for GridSearchCV (or RandomizedSearchCV) so as to stop wasting resources?

In other words, how do I decide whether, e.g., C values above 100 make sense, and whether a step of 1 is neither too big nor too small? Any help is very much appreciated. This is the set-up I am currently using:

import numpy as np
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

parameters = {
    'C':            np.arange( 1, 100+1, 1 ).tolist(),
    'kernel':       ['linear', 'rbf'],                          # also: 'poly', 'sigmoid', 'precomputed'
    'degree':       np.arange( 0, 100, 1 ).tolist(),            # only used by the 'poly' kernel
    'gamma':        np.arange( 0.1, 10.0, 0.1 ).tolist(),       # gamma must be > 0
    'coef0':        np.arange( 0.0, 10.0, 0.1 ).tolist(),
    'shrinking':    [True],
    'probability':  [False],
    'tol':          np.arange( 0.001, 0.01+0.001, 0.001 ).tolist(),
    'cache_size':   [2000],
    'class_weight': [None],
    'verbose':      [False],
    'max_iter':     [-1],
    'random_state': [None],
    }

model = RandomizedSearchCV( n_iter              = 500,
                            estimator           = SVC(),
                            param_distributions = parameters,
                            n_jobs              = 4,
                            refit               = True,
                            cv                  = 5,
                            verbose             = 1,
                            pre_dispatch        = '2*n_jobs'
                            )         # scoring = 'accuracy'
model.fit( train_X, train_Y )
print( model.best_estimator_ )
print( model.best_score_ )
print( model.best_params_ )

Recommended answer

Which kernel works best depends a lot on your data. How many samples and dimensions do you have, and what kind of data is it? For the ranges to be comparable, you need to normalize your data; often StandardScaler, which scales to zero mean and unit variance, is a good idea. If your data is non-negative, you might try MinMaxScaler.
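
As a minimal sketch (the Pipeline wiring and the step names 'scale' and 'svc' are illustrative, not from the original answer), putting the scaler inside a Pipeline ensures it is re-fit on each cross-validation training fold:

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Scaling inside the pipeline is re-fit per CV fold, so no test-fold
# statistics leak into training.
pipe = Pipeline([
    ('scale', StandardScaler()),   # zero mean, unit variance
    ('svc',   SVC()),
])
# In a search over the pipeline, parameters are addressed with the
# step prefix, e.g. 'svc__C' and 'svc__gamma'.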

For C and gamma, I usually do

{'C': np.logspace(-3, 2, 6), 'gamma': np.logspace(-3, 2, 6)}

which is based on nothing but has served me well for the last couple of years. I would strongly advise against non-logarithmic grids, and even more so against randomized search over discrete parameters. One of the main advantages of randomized search is that you can actually search continuous parameters using continuous distributions [see the docs].
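
For example (a sketch assuming scipy.stats.loguniform and the sklearn.model_selection API; the n_iter value here is arbitrary), C and gamma can be sampled from continuous log-uniform distributions rather than from a discrete list:

from scipy.stats import loguniform
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Continuous log-uniform distributions over the same span as
# np.logspace(-3, 2, 6); every draw is a fresh value, not one of six.
param_distributions = {
    'C':     loguniform(1e-3, 1e2),
    'gamma': loguniform(1e-3, 1e2),
}
search = RandomizedSearchCV( estimator           = SVC(kernel='rbf'),
                             param_distributions = param_distributions,
                             n_iter              = 50,
                             cv                  = 5,
                             n_jobs              = 4 )
# search.fit(train_X, train_Y)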
