Custom loss function that updates at each step via gradient descent


Problem Description


From this post, we can write a custom loss function. Now, assume that the custom loss function depends on a parameter a:

def customLoss(yTrue, yPred):
    return (K.log(yTrue) - K.log(yPred))**2 + a*yPred

How can we update the parameter a at each step in a gradient-descent manner, just like the weights?

a_new = a_old - alpha * (derivative of custom loss with respect to a)
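For the example loss above, the derivative with respect to a is simply yPred, so one such update step can be sketched in plain Python (the learning rate, the current value of a, and the sample values below are placeholders for illustration):

```python
# Example loss from above: L = (log(yTrue) - log(yPred))**2 + a*yPred
# Its derivative with respect to a is simply yPred.
def dloss_da(y_true, y_pred):
    return y_pred  # dL/da for the example loss

alpha = 0.01               # learning rate (placeholder)
a_old = 0.5                # current parameter value (placeholder)
y_true, y_pred = 2.0, 1.5  # one sample (placeholder)

# One gradient-descent step on a:
a_new = a_old - alpha * dloss_da(y_true, y_pred)
print(a_new)  # 0.5 - 0.01 * 1.5 = 0.485
```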

P.S. The real custom loss is different from the one above. Please give a general answer that works for any arbitrary custom loss function, not one specific to the example above.

Solution

Create a custom layer to hold the trainable parameter. This layer will not return its inputs in its call, but it still takes inputs to comply with how Keras layers are created.

import keras
from keras.layers import Layer

class TrainableLossLayer(Layer):

    def __init__(self, a_initializer, **kwargs):
        super(TrainableLossLayer, self).__init__(**kwargs)
        self.a_initializer = keras.initializers.get(a_initializer)

    # method where the weight is defined
    def build(self, input_shape):
        self.kernel = self.add_weight(name='kernel_a',
                                      shape=(1,),
                                      initializer=self.a_initializer,
                                      trainable=True)
        self.built = True

    # method defining the layer's operation (just return the weight)
    def call(self, inputs):
        return self.kernel

    # output shape
    def compute_output_shape(self, input_shape):
        return (1,)

Use the layer in your model to get a with any inputs (this is not compatible with a Sequential model):

a = TrainableLossLayer(a_init, name="somename")(anyInput)

Now, you can try to define your loss in a sort of ugly way:

def customLoss(yTrue, yPred):
    return (K.log(yTrue) - K.log(yPred))**2 + a*yPred

If this works, then it's ready.


You can also try a more complicated model (useful if you don't want a jumping over the layers into the loss like that, which might cause problems in model saving/loading).

In this case, you will need y_train to go in as an input instead of an output:

y_true_inputs = Input(...)

Your loss function will go into a Lambda layer taking all parameters properly:

def lambdaLoss(x):
    yTrue, yPred, alpha = x
    return (K.log(yTrue) - K.log(yPred))**2 + alpha*yPred

loss = Lambda(lambdaLoss)([y_true_inputs, original_model_outputs, a])

Your model will output this loss:

model = Model([original_model_inputs, y_true_inputs], loss)

You will have a dummy loss function (since the model's output is already the loss value, the dummy loss just returns the prediction, so Keras minimizes the model's output directly):

def dummyLoss(true, pred):
    return pred

model.compile(loss = dummyLoss, ...)

And train as:

model.fit([x_train, y_train], anything_maybe_None_or_np_zeros, ...)
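To see the whole idea end to end without Keras, here is a minimal pure-Python sketch: a toy model y_pred = w * x whose weight w and loss parameter a are both updated by gradient descent on the question's example loss. All names and constants below are illustrative assumptions, not part of the answer's Keras code:

```python
import math

# Toy model: y_pred = w * x, with the question's example loss
# L = (log(y_true) - log(y_pred))**2 + a * y_pred
def loss(w, a, x, y_true):
    y_pred = w * x
    return (math.log(y_true) - math.log(y_pred)) ** 2 + a * y_pred

# Analytic gradients of L:
#   dL/dw = -2 * (log(y_true) - log(y_pred)) / w + a * x
#   dL/da = y_pred
def grads(w, a, x, y_true):
    y_pred = w * x
    dw = -2.0 * (math.log(y_true) - math.log(y_pred)) / w + a * x
    da = y_pred
    return dw, da

x, y_true = 1.0, 2.0   # one training sample (placeholder)
w, a = 1.0, 0.5        # initial weight and loss parameter
alpha = 0.05           # learning rate

history = [loss(w, a, x, y_true)]
for _ in range(30):
    dw, da = grads(w, a, x, y_true)
    w -= alpha * dw    # ordinary weight update
    a -= alpha * da    # a is updated exactly the same way
    history.append(loss(w, a, x, y_true))
```

This is what the Keras setup above achieves: by registering a as a trainable weight, the optimizer applies the same update rule to it as to every other weight in the model.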
