What is _passthrough_scorer and How Can I Change Scorers in GridSearchCV (sklearn)?

Question

http://scikit-learn.org/stable/modules/generated/sklearn.grid_search.GridSearchCV.html (for reference)

x = [[2], [1], [3], [1] ... ] # about 1000 data 
grid = GridSearchCV(KernelDensity(), {'bandwidth': np.linspace(0.1, 1.0, 10)}, cv=10)
grid.fit(x)

When I use GridSearchCV without specifying a scoring function, as in the code above, grid.scorer_ is reported as _passthrough_scorer. Could you explain what kind of function _passthrough_scorer is?

In addition, I want to change the scoring function to mean_squared_error or something else.

grid = GridSearchCV(KernelDensity(), {'bandwidth': np.linspace(0.1, 1.0, 10)}, cv=10, scoring='mean_squared_error')

But the line grid.fit(x) always gives me this error message:

TypeError: __call__() missing 1 required positional argument: 'y_true'

I cannot figure out how to pass y_true to the function because I do not know the true distribution. Could you tell me how to change the scoring function? I appreciate your help.

Recommended answer

KernelDensity's default distance metric is Euclidean (equivalent to minkowski with p=2), but that metric is not what grid search scores with. If you do not pass a scoring argument, GridSearchCV falls back to the estimator's own score method, and _passthrough_scorer is the thin wrapper that makes that call. For KernelDensity, score(X) returns the total log-likelihood of X under the fitted density, so no y_true is needed.
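To make the default behaviour concrete, here is a minimal sketch, assuming a recent scikit-learn where GridSearchCV lives in sklearn.model_selection. The data and the three bandwidth values are made up for illustration. It checks that calling grid.scorer_ on an estimator gives the same number as the estimator's own score method, which for KernelDensity is the sum of the per-sample log-densities.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KernelDensity

# Illustrative data (an assumption, not the asker's data): 200 draws from N(0, 1)
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 1))

# No scoring argument, so grid search uses the estimator's own score method
grid = GridSearchCV(KernelDensity(), {'bandwidth': [0.3, 0.5, 1.0]}, cv=5)
grid.fit(X)

best = grid.best_estimator_
via_scorer = grid.scorer_(best, X)           # the pass-through scorer
via_method = best.score(X)                   # total log-likelihood of X
via_samples = best.score_samples(X).sum()    # sum of per-sample log-densities
print(via_scorer, via_method, via_samples)
```

All three printed values should agree, which is why no y_true is involved in the default setup.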

The formula for mean squared error is sum((y_true - y_estimated)^2)/n. You got the error because this scorer needs a y_true to compare against, and grid.fit(x) supplies none.
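As a small worked example of that formula, the sketch below computes the mean squared error by hand and with sklearn.metrics.mean_squared_error on made-up values. It shows that both arguments are required, which is why the metric does not fit an unsupervised density estimate. Note also that recent scikit-learn versions name the corresponding scoring string 'neg_mean_squared_error' rather than 'mean_squared_error'.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Made-up values for illustration only
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.0, 2.0, 5.0])

# sum((y_true - y_estimated)^2) / n, written out directly
mse_manual = np.sum((y_true - y_pred) ** 2) / len(y_true)

# The library function needs both y_true and y_pred as well
mse_sklearn = mean_squared_error(y_true, y_pred)
print(mse_manual, mse_sklearn)
```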

Here is a made-up example of applying GridSearchCV to KernelDensity:

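# Written for scikit-learn < 0.20; newer releases moved GridSearchCV to
# sklearn.model_selection and replaced grid_scores_ with cv_results_.
from sklearn.neighbors import KernelDensity
from sklearn.grid_search import GridSearchCV
import numpy as np

# 100 integer samples drawn from two overlapping ranges, as a column vector
X = np.concatenate((np.random.randint(0, 10, 50),
                    np.random.randint(5, 10, 50)))[:, np.newaxis]

# Search 10 log-spaced bandwidths between 0.1 and 10
params = {'bandwidth': np.logspace(-1.0, 1.0, 10)}
grid = GridSearchCV(KernelDensity(), params)
grid.fit(X)
print(grid.grid_scores_)
print('Best parameter: ', grid.best_params_)
print('Best score: ', grid.best_score_)
print('Best estimator: ', grid.best_estimator_)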

The output is:

[mean: -96.94890, std: 100.60046, params: {'bandwidth': 0.10000000000000001},
 mean: -70.44643, std: 40.44537, params: {'bandwidth': 0.16681005372000587},
 mean: -71.75293, std: 18.97729, params: {'bandwidth': 0.27825594022071243},
 mean: -77.83446, std: 11.24102, params: {'bandwidth': 0.46415888336127786},
 mean: -78.65182, std: 8.72507, params: {'bandwidth': 0.774263682681127},
 mean: -79.78828, std: 6.98582, params: {'bandwidth': 1.2915496650148841},
 mean: -81.65532, std: 4.77806, params: {'bandwidth': 2.1544346900318834},
 mean: -86.27481, std: 2.71635, params: {'bandwidth': 3.5938136638046259},
 mean: -95.86093, std: 1.84887, params: {'bandwidth': 5.9948425031894086},
 mean: -109.52306, std: 1.71232, params: {'bandwidth': 10.0}]
Best parameter:  {'bandwidth': 0.16681005372000587}
Best score:  -70.4464315885
Best estimator:  KernelDensity(algorithm='auto', atol=0, bandwidth=0.16681005372000587,
       breadth_first=True, kernel='gaussian', leaf_size=40,
       metric='euclidean', metric_params=None, rtol=0)
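On scikit-learn 0.20 and later, the sklearn.grid_search module and the grid_scores_ attribute no longer exist. The following is a sketch of the same experiment against the newer API, assuming sklearn.model_selection.GridSearchCV and the cv_results_ dictionary. The fixed seed and cv=3 are choices made here for reproducibility, so the numbers will not match the output above.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KernelDensity

# Same shape of data as the example above, with a fixed seed (an assumption)
rng = np.random.RandomState(0)
X = np.concatenate((rng.randint(0, 10, 50),
                    rng.randint(5, 10, 50)))[:, np.newaxis].astype(float)

params = {'bandwidth': np.logspace(-1.0, 1.0, 10)}
grid = GridSearchCV(KernelDensity(), params, cv=3)
grid.fit(X)

# cv_results_ replaces grid_scores_: one entry per parameter setting
results = grid.cv_results_
for mean, std, p in zip(results['mean_test_score'],
                        results['std_test_score'],
                        results['params']):
    print('mean: %.5f, std: %.5f, params: %r' % (mean, std, p))

print('Best parameter: ', grid.best_params_)
print('Best score: ', grid.best_score_)
print('Best estimator: ', grid.best_estimator_)
```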

The built-in scoring methods for GridSearchCV usually need y_true. In your case, you may instead want to change the distance metric of sklearn.neighbors.KernelDensity (see sklearn.metrics.pairwise.pairwise_kernels and sklearn.metrics.pairwise.pairwise_distances for the kinds of metrics available). The metric changes the fitted density, and therefore the log-likelihood that grid search uses as its score.
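If a different score is wanted without inventing a y_true, one option is to pass a callable as scoring. The sketch below assumes a recent scikit-learn, where scoring accepts a callable taking (estimator, X, y). The function name mean_log_likelihood and the data are hypothetical, chosen only for illustration. It scores each held-out fold by its average log-density per sample rather than the total.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KernelDensity

def mean_log_likelihood(estimator, X, y=None):
    """Hypothetical scorer: average log-density per held-out sample.

    y defaults to None because no targets exist in density estimation.
    """
    return float(np.mean(estimator.score_samples(X)))

# Illustrative data (an assumption): 300 draws from N(0, 1)
rng = np.random.RandomState(0)
X = rng.normal(size=(300, 1))

params = {'bandwidth': np.linspace(0.1, 1.0, 10)}
grid = GridSearchCV(KernelDensity(), params,
                    scoring=mean_log_likelihood, cv=5)
grid.fit(X)  # no y is passed, so the scorer only ever sees X
print(grid.best_params_, grid.best_score_)
```

Higher is still better with this scorer, so GridSearchCV's usual selection of the maximum applies unchanged.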
