Make grid search functions in sklearn ignore empty models
Question
Using Python and scikit-learn, I'd like to do a grid search, but some of my models end up being empty. How can I make the grid search function ignore those models?
I guess I could use a scoring function that returns 0 if the model is empty, but I'm not sure how.
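import numpy as np
import sklearn.svm
import sklearn.grid_search  # replaced by sklearn.model_selection in later scikit-learn releases

# class_weight='auto' was later renamed to 'balanced'
predictor = sklearn.svm.LinearSVC(penalty='l1', dual=False, class_weight='auto')
# Candidate values for C: 2^-10, 2^-9, ..., 2^10
param_dist = {'C': pow(2.0, np.arange(-10, 11))}
learner = sklearn.grid_search.GridSearchCV(estimator=predictor,
                                           param_grid=param_dist,
                                           n_jobs=self.n_jobs, cv=5,
                                           verbose=0)
learner.fit(X, y)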
My data is such that this learner object will choose a C corresponding to an empty model. Any idea how I can make sure the model is not empty?
EDIT: By an "empty model" I mean a model that has selected 0 features to use. This can easily happen, especially with an l1-regularized model. In this case, if the C in the SVM is small enough, the optimization problem will find the zero vector as the optimal solution for the coefficients, so predictor.coef_ will be a vector of 0s.
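To make the notion of an "empty model" concrete, here is a minimal sketch of a check on a coefficient array such as predictor.coef_. The helper names n_selected_features and is_empty_model, and the tolerance tol, are hypothetical and not part of scikit-learn; the example arrays are made up for illustration.

```python
import numpy as np


def n_selected_features(coef, tol=1e-8):
    """Count features that have at least one coefficient with magnitude above tol."""
    coef = np.atleast_2d(coef)  # accept shape (n_features,) or (n_classes, n_features)
    return int(np.sum(np.any(np.abs(coef) > tol, axis=0)))


def is_empty_model(coef, tol=1e-8):
    """True when no feature was selected, i.e. every coefficient is (numerically) zero."""
    return n_selected_features(coef, tol) == 0


# Made-up coefficient arrays standing in for predictor.coef_
coef_empty = np.zeros((1, 5))
coef_sparse = np.array([[0.0, 0.7, 0.0, -1.2, 0.0]])

print(is_empty_model(coef_empty))        # True
print(n_selected_features(coef_sparse))  # 2
```

After fitting, the same check could be applied as is_empty_model(predictor.coef_).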
Recommended answer
Try implementing a custom scorer, something like:
import numpy as np

def scorer_(estimator, X, y):
    # Your criterion here: a model whose coefficients are all zero scores 0
    if np.allclose(estimator.coef_, np.zeros_like(estimator.coef_)):
        return 0
    else:
        return estimator.score(X, y)

learner = sklearn.grid_search.GridSearchCV(...
                                           scoring=scorer_)
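For reference, the same idea can be sketched end to end. This version makes several assumptions that are not in the original post: it uses the newer sklearn.model_selection.GridSearchCV import path, synthetic data from make_classification, and a hypothetical scorer name nonempty_scorer. It is meant as an illustration of the approach rather than a definitive setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC


def nonempty_scorer(estimator, X, y):
    """Return 0 for a model with all-zero coefficients, otherwise its accuracy."""
    if np.allclose(estimator.coef_, 0.0):
        return 0.0
    return estimator.score(X, y)


# Synthetic data, used only for illustration (assumption, not from the question).
X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, random_state=0)

learner = GridSearchCV(
    estimator=LinearSVC(penalty="l1", dual=False),
    param_grid={"C": 2.0 ** np.arange(-10, 11)},
    scoring=nonempty_scorer,  # callable with signature (estimator, X, y)
    cv=5,
)
learner.fit(X, y)

print("best C:", learner.best_params_["C"])
print("non-zero coefficients:", np.count_nonzero(learner.best_estimator_.coef_))
```

Because empty models receive a score of 0 on every fold, any value of C that yields a non-empty model with non-zero accuracy ranks above them. Note that this only pushes empty models to the bottom of the ranking; it does not remove them from the search.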