How to create/customize your own scorer function in scikit-learn?

Question

I am using Support Vector Regression as an estimator in GridSearchCV. But I want to change the error function: instead of using the default (R-squared: coefficient of determination), I would like to define my own custom error function.
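
For context, when no scoring argument is given, GridSearchCV ranks each parameter combination using the estimator's own score method, which for SVR is R², the coefficient of determination. A minimal sketch of that default setup, with illustrative parameter values, looks like this:

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# No scoring argument: every candidate is evaluated with SVR.score,
# i.e. the R^2 coefficient of determination, on each validation fold.
default_search = GridSearchCV(SVR(kernel='rbf'),
                              param_grid={"C": [1, 10, 100]},
                              cv=5)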

I tried to make one with make_scorer, but it didn't work.

I read the documentation and found that it's possible to create custom estimators, but I don't need to remake the entire estimator - only the error/scoring function.

I think I can do it by defining a callable as a scorer, like it says in the docs.

But I don't know how to hook it up to my estimator, which in my case is SVR. Would I have to switch to a classifier (such as SVC)? And how would I use it?
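
For reference, a scorer passed to GridSearchCV as a plain callable must have the signature scorer(estimator, X, y) and return a single number, where higher means better; the estimator stays a regressor such as SVR, so there is no need to switch to SVC. A minimal sketch of that signature (the function name and the mean-absolute-error body are just illustrative):

import numpy as np

def my_callable_scorer(estimator, X, y):
    # estimator is the model fitted on the training fold (here an SVR);
    # X, y are the validation fold passed in by GridSearchCV.
    y_pred = estimator.predict(X)
    # Return one float; GridSearchCV picks the parameters that maximize it,
    # so a loss such as MAE is returned negated.
    return -np.mean(np.abs(y - y_pred))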

My custom error function is as follows:

import numpy as np

def my_custom_loss_func(X_train_scaled, Y_train_scaled):
    # Asymmetric error: deviations above/below the threshold M are
    # penalized differently, with an exponential weight on the distance z.
    error, M = 0, 0
    for i in range(0, len(Y_train_scaled)):
        z = (Y_train_scaled[i] - M)
        if X_train_scaled[i] > M and Y_train_scaled[i] > M and (X_train_scaled[i] - Y_train_scaled[i]) > 0:
            error_i = (abs(Y_train_scaled[i] - X_train_scaled[i]))**(2*np.exp(z))
        if X_train_scaled[i] > M and Y_train_scaled[i] > M and (X_train_scaled[i] - Y_train_scaled[i]) < 0:
            error_i = -(abs((Y_train_scaled[i] - X_train_scaled[i]))**(2*np.exp(z)))
        if X_train_scaled[i] > M and Y_train_scaled[i] < M:
            error_i = -(abs(Y_train_scaled[i] - X_train_scaled[i]))**(2*np.exp(-z))
        error += error_i  # accumulate inside the loop so every sample contributes
    return error

In my real data the variable M isn't null/zero; I've just set it to zero here for simplicity.

Would anyone be able to show an example application of this custom scoring function? Thanks for your help!

Answer

As you saw, this is done by using make_scorer (docs).

# GridSearchCV lives in sklearn.model_selection in current scikit-learn
# (the old sklearn.grid_search module has been removed).
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import make_scorer
from sklearn.svm import SVR

import numpy as np

rng = np.random.RandomState(1)

def my_custom_loss_func(X_train_scaled, Y_train_scaled):
    # make_scorer calls this as func(y_true, y_pred), so the first argument
    # receives the true targets and the second the predictions.
    error, M = 0, 0
    for i in range(0, len(Y_train_scaled)):
        z = (Y_train_scaled[i] - M)
        if X_train_scaled[i] > M and Y_train_scaled[i] > M and (X_train_scaled[i] - Y_train_scaled[i]) > 0:
            error_i = (abs(Y_train_scaled[i] - X_train_scaled[i]))**(2*np.exp(z))
        if X_train_scaled[i] > M and Y_train_scaled[i] > M and (X_train_scaled[i] - Y_train_scaled[i]) < 0:
            error_i = -(abs((Y_train_scaled[i] - X_train_scaled[i]))**(2*np.exp(z)))
        if X_train_scaled[i] > M and Y_train_scaled[i] < M:
            error_i = -(abs(Y_train_scaled[i] - X_train_scaled[i]))**(2*np.exp(-z))
        error += error_i  # accumulate inside the loop so every sample contributes
    return error

# Generate sample data
X = 5 * rng.rand(10000, 1)
y = np.sin(X).ravel()

# Add noise to every fifth target (integer division keeps the count an int)
y[::5] += 3 * (0.5 - rng.rand(X.shape[0] // 5))

train_size = 100

my_scorer = make_scorer(my_custom_loss_func, greater_is_better=True)

svr = GridSearchCV(SVR(kernel='rbf', gamma=0.1),
                   scoring=my_scorer,
                   cv=5,
                   param_grid={"C": [1e0, 1e1, 1e2, 1e3],
                               "gamma": np.logspace(-2, 2, 5)})

svr.fit(X[:train_size], y[:train_size])

print(svr.best_params_)
print(svr.score(X[train_size:], y[train_size:]))
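
A note on the make_scorer call above: with greater_is_better=True the raw return value of my_custom_loss_func is maximized during the grid search; if the function is really a loss to be minimized, pass greater_is_better=False and make_scorer will flip its sign so that larger is still better. The scorer object can also be called directly on a fitted estimator, which is handy for checking held-out data. A small sketch reusing the objects defined above:

# A scorer built by make_scorer is callable as scorer(estimator, X, y);
# this evaluates the refitted best model on the held-out samples.
print(my_scorer(svr.best_estimator_, X[train_size:], y[train_size:]))

# If my_custom_loss_func should be minimized rather than maximized,
# wrap it like this instead; make_scorer then negates the returned value.
my_loss_scorer = make_scorer(my_custom_loss_func, greater_is_better=False)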
