Adjust custom loss function for gradient boosting classification


Problem description

I have implemented a gradient boosting decision tree to do a multiclass classification. My custom loss functions look like this:

import numpy as np
from sklearn.preprocessing import OneHotEncoder


def softmax(mat):
    # Row-wise softmax over the raw class scores.
    res = np.exp(mat)
    res = np.multiply(res, 1/np.sum(res, axis=1, keepdims=True))
    return res


def custom_asymmetric_objective(y_true, y_pred_encoded):
    # Predictions arrive flattened in Fortran order; reshape to (n_samples, 3).
    pred = y_pred_encoded.reshape((-1, 3), order='F')
    pred = softmax(pred)
    # One-hot encode the labels so they line up with the (n_samples, 3) predictions.
    y_true = OneHotEncoder(sparse=False, categories='auto').fit_transform(y_true.reshape(-1, 1))
    # Gradient of the softmax loss and an approximation of its diagonal Hessian.
    grad = (pred - y_true).astype("float")
    hess = 2.0 * pred * (1.0-pred)
    return grad.flatten('F'), hess.flatten('F')


def custom_asymmetric_valid(y_true, y_pred_encoded):
    # One-hot encode the labels and flatten them in the same Fortran order as the predictions.
    y_true = OneHotEncoder(sparse=False, categories='auto').fit_transform(y_true.reshape(-1, 1)).flatten('F')
    # Scaled margin between the one-hot labels and the raw predictions.
    margin = (y_true - y_pred_encoded).astype("float")
    loss = margin*10
    return "custom_asymmetric_eval", np.mean(loss), False

Everything works, but now I want to adjust my loss function in the following way: it should "penalize" if an item is classified incorrectly, and a penalty should be added for a certain constraint (this is calculated beforehand; let's just say the penalty is e.g. 0.05, so just a real number). Is there any way to consider both the misclassification and the penalty value?
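
For illustration only, here is a minimal sketch of how such a precomputed penalty could be folded into the evaluation metric; the name custom_asymmetric_valid_penalized and the penalty argument are hypothetical, not part of the original post. Note that a constant added to the loss has zero derivative with respect to the predictions, so to actually influence training it would also have to appear in the grad and hess returned by the objective.

import numpy as np
from sklearn.preprocessing import OneHotEncoder


def custom_asymmetric_valid_penalized(y_true, y_pred_encoded, penalty=0.05):
    # Reshape the flattened predictions to (n_samples, 3) and take the arg-max class.
    pred = y_pred_encoded.reshape((-1, 3), order='F')
    pred_label = np.argmax(pred, axis=1)
    misclassified = (pred_label != y_true)

    # Same margin-based loss as the original custom_asymmetric_valid.
    y_true_ohe = OneHotEncoder(sparse=False, categories='auto').fit_transform(y_true.reshape(-1, 1)).flatten('F')
    margin = (y_true_ohe - y_pred_encoded).astype("float")
    base_loss = np.mean(margin * 10)

    # Add the precomputed penalty (e.g. 0.05) once per misclassified sample.
    total_loss = base_loss + penalty * np.mean(misclassified)
    return "custom_asymmetric_eval", total_loss, False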

Recommended answer

Try L2 regularization: the weights are updated by subtracting the learning rate times the error times x, plus a penalty term lambda times the weight to the power of 2 that is added to the loss. Differentiating that penalty gives the update

w := w − α · (error · x + 2 · λ · w)

Simplified:

w := w · (1 − 2 · α · λ) − α · error · x

This will be the effect: every update shrinks the weights toward zero in addition to the usual gradient step.
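
The answer is phrased in terms of a generic weight update; inside a custom boosting objective there are no explicit weights to shrink. As a rough sketch (not from the original answer), the same idea can be mimicked by adding an L2 term lam * raw**2 on the raw scores, which simply shifts the gradient and Hessian returned by the objective. The names custom_asymmetric_objective_l2 and lam are hypothetical, and softmax is the helper defined in the question above.

from sklearn.preprocessing import OneHotEncoder


def custom_asymmetric_objective_l2(y_true, y_pred_encoded, lam=0.05):
    # Raw scores, reshaped from Fortran-order flattening to (n_samples, 3).
    raw = y_pred_encoded.reshape((-1, 3), order='F')
    pred = softmax(raw)  # softmax as defined in the question
    y_true_ohe = OneHotEncoder(sparse=False, categories='auto').fit_transform(y_true.reshape(-1, 1))

    # Same gradient and Hessian as in the question's custom_asymmetric_objective.
    grad = (pred - y_true_ohe).astype("float")
    hess = 2.0 * pred * (1.0 - pred)

    # L2 penalty lam * raw**2 on the raw scores:
    # first derivative 2 * lam * raw, second derivative 2 * lam.
    grad += 2.0 * lam * raw
    hess += 2.0 * lam

    return grad.flatten('F'), hess.flatten('F')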

ADDED: The penalization term (on the right of the equation) increases the generalization power of your model. So if you overfit your model on the training set, the performance will be poor on the test set. You therefore penalize those "right" classifications on the training set that generate errors on the test set and compromise generalization.
