Keras: Custom loss function with training data not directly related to model


Question

I am trying to convert my CNN, written with tensorflow layers, to use the keras api in tensorflow (the keras api provided by TF 1.x), and I am having trouble writing a custom loss function to train the model.

According to this guide, a custom loss function is expected to take the arguments (y_true, y_pred): https://www.tensorflow.org/guide/keras/train_and_evaluate#custom_losses

def basic_loss_function(y_true, y_pred):
    return ...

However, in every example I have seen, y_true is somehow directly related to the model (in the simple case it is the output of the network). In my problem, this is not the case. How do I implement this if my loss function depends on some training data that is unrelated to the tensors of the model?

To be concrete, here is my problem:

I am trying to learn an image embedding trained on pairs of images. My training data includes the image pairs and annotations of matching points between the two images of each pair (image coordinates). The input feature is only the image pair, and the network is trained in a siamese configuration.

I was able to implement this with tensorflow layers and train it successfully with tensorflow estimators. My current implementation builds a tf Dataset from a large database of tf Records, where the features are a dictionary containing the images and the arrays of matching points. Before, I could easily feed these arrays of image coordinates to the loss function, but with the keras api it is unclear how to do so.

Recommended answer

A hack I often use is to calculate the loss inside the model by means of Lambda layers (for instance, when the loss is independent of the true data and the model doesn't really have an output to compare against).

In a functional API model:

from tensorflow.keras import backend as K

def loss_calc(x):
    # arbitrary inputs, you choose them according to
    # what you passed to the Lambda layer
    loss_input_1, loss_input_2 = x

    # here you use some external data that doesn't relate to the samples
    externalData = K.constant(external_numpy_data)

    # calculate the loss from the inputs and externalData
    loss = ...
    return loss

Build the loss from the outputs of the model itself (the tensor(s) that are used in your loss):

loss = Lambda(loss_calc)([model_output_1, model_output_2])

Create the model so that it outputs the loss instead of the usual outputs:

model = Model(inputs, loss)

Create a dummy keras loss function for compilation:

def dummy_loss(y_true, y_pred):
    return y_pred #where y_pred is the loss itself, the output of the model above

model.compile(loss=dummy_loss, ...)

For training, pass any dummy array whose first dimension matches the number of samples; its values are ignored:

model.fit(your_inputs, np.zeros((number_of_samples,)), ...)
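
The steps above can be put together as one small end-to-end sketch. It assumes TensorFlow 2.x with tf.keras rather than the TF 1.x setup from the question, and it uses tf.constant where the answer uses K.constant. The tiny encoder, the weighted squared-distance loss, the random images and the names EMBED_DIM, build_model and train are all made up for illustration; they are not from the original answer.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

EMBED_DIM = 4
# Hypothetical external data: one weight per embedding dimension.
external_numpy_data = np.linspace(0.5, 1.5, EMBED_DIM).astype("float32")


def loss_calc(x):
    emb_a, emb_b = x
    external_data = tf.constant(external_numpy_data)  # stands in for K.constant
    # Weighted squared distance, one value per sample: shape (batch, 1).
    return tf.reduce_sum(
        external_data * tf.square(emb_a - emb_b), axis=-1, keepdims=True
    )


def dummy_loss(y_true, y_pred):
    # y_pred already is the loss computed inside the model; y_true is ignored.
    return y_pred


def build_model(image_shape=(8, 8, 1)):
    img_a = layers.Input(shape=image_shape)
    img_b = layers.Input(shape=image_shape)
    # Shared (siamese) encoder; an assumption made for this sketch.
    encoder = tf.keras.Sequential([layers.Flatten(), layers.Dense(EMBED_DIM)])
    loss = layers.Lambda(loss_calc)([encoder(img_a), encoder(img_b)])
    model = Model([img_a, img_b], loss)
    model.compile(loss=dummy_loss, optimizer="sgd")
    return model


def train(number_of_samples=16, epochs=2):
    rng = np.random.default_rng(0)
    images_a = rng.random((number_of_samples, 8, 8, 1)).astype("float32")
    images_b = rng.random((number_of_samples, 8, 8, 1)).astype("float32")
    model = build_model()
    # Dummy targets: only the number of samples matters.
    dummy_targets = np.zeros((number_of_samples, 1), dtype="float32")
    history = model.fit(
        [images_a, images_b], dummy_targets,
        epochs=epochs, batch_size=8, verbose=0,
    )
    return model, history
```

Calling train() returns the fitted model and its History object. Since the model outputs the per-sample loss, model.predict on a pair of images gives that pair's loss value rather than an embedding.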


Another way of doing it is to use a custom training loop.

This takes more work, though.

Although you're using TF1, you can still turn eager execution on at the very beginning of your code, with tf.enable_eager_execution(), and work the way it is done in TF2.

Follow the tutorial for custom training loops: https://www.tensorflow.org/tutorials/customization/custom_training_walkthrough

Here you calculate the gradients yourself, of any result with respect to whatever you want. This means you don't need to follow the Keras training conventions.
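
As a rough sketch of such a loop, the snippet below feeds the matching-point coordinates straight into the loss, which is possible because nothing has to pass through a (y_true, y_pred) signature. It assumes TensorFlow 2.x, where eager execution is on by default (on TF 1.x you would call tf.enable_eager_execution() first). The one-layer convolutional encoder, the feature-distance loss, the random data and the names matching_loss, train_step and run_demo are illustrative assumptions, not the asker's actual model.

```python
import numpy as np
import tensorflow as tf


def matching_loss(encoder, img_a, img_b, pts_a, pts_b):
    """Mean squared distance between features sampled at matched points."""
    feat_a = encoder(img_a, training=True)  # (batch, H, W, C)
    feat_b = encoder(img_b, training=True)
    # pts_* hold integer (row, col) coordinates, shape (batch, n_points, 2).
    desc_a = tf.gather_nd(feat_a, pts_a, batch_dims=1)  # (batch, n_points, C)
    desc_b = tf.gather_nd(feat_b, pts_b, batch_dims=1)
    return tf.reduce_mean(tf.reduce_sum(tf.square(desc_a - desc_b), axis=-1))


def train_step(encoder, optimizer, batch):
    # Record the forward pass, then differentiate the loss ourselves.
    with tf.GradientTape() as tape:
        loss = matching_loss(encoder, *batch)
    grads = tape.gradient(loss, encoder.trainable_weights)
    optimizer.apply_gradients(zip(grads, encoder.trainable_weights))
    return float(loss)


def run_demo(steps=5, seed=0):
    rng = np.random.default_rng(seed)
    tf.random.set_seed(seed)
    # Random stand-ins for one batch of image pairs and their matched points.
    img_a = tf.constant(rng.random((4, 16, 16, 1)).astype("float32"))
    img_b = tf.constant(rng.random((4, 16, 16, 1)).astype("float32"))
    pts_a = tf.constant(rng.integers(0, 16, size=(4, 10, 2)).astype("int32"))
    pts_b = tf.constant(rng.integers(0, 16, size=(4, 10, 2)).astype("int32"))
    encoder = tf.keras.Sequential(
        [tf.keras.layers.Conv2D(8, 3, padding="same")]
    )
    optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
    batch = (img_a, img_b, pts_a, pts_b)
    # Repeat the step on the same batch and collect the loss values.
    return [train_step(encoder, optimizer, batch) for _ in range(steps)]
```

In a real pipeline the batch would come from iterating over the tf Dataset, with the images and point arrays taken from the feature dictionary instead of the random arrays used here.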

Finally, you can use the approach you suggested, model.add_loss. In this case, you calculate the loss exactly the same way as in the first approach and pass this loss tensor to add_loss.

You can then probably compile the model with loss=None (not sure), because you're going to use other losses rather than the standard one.

In this case, your model's output will probably be None too, and you should fit with y=None.
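
A minimal sketch of the add_loss idea follows, assuming TensorFlow 2.x. Note that it is a variant of what the answer describes: instead of calling model.add_loss on a symbolic tensor, it calls add_loss inside a small custom layer, a pattern that newer Keras versions also accept. The PairLoss layer, the encoder, the squared-distance loss and the helper names are hypothetical. The model is compiled without a loss and fitted without targets.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

EMBED_DIM = 4


class PairLoss(layers.Layer):
    """Hypothetical layer: registers the pair loss and passes emb_a through."""

    def call(self, inputs):
        emb_a, emb_b = inputs
        # Scalar loss for the batch, registered with Keras via add_loss.
        self.add_loss(
            tf.reduce_mean(tf.reduce_sum(tf.square(emb_a - emb_b), axis=-1))
        )
        return emb_a


def build_add_loss_model(image_shape=(8, 8, 1)):
    img_a = layers.Input(shape=image_shape)
    img_b = layers.Input(shape=image_shape)
    # Shared (siamese) encoder; an assumption made for this sketch.
    encoder = tf.keras.Sequential([layers.Flatten(), layers.Dense(EMBED_DIM)])
    embedding = PairLoss()([encoder(img_a), encoder(img_b)])
    model = Model([img_a, img_b], embedding)
    # No loss argument: the only loss is the one registered by add_loss.
    model.compile(optimizer="sgd")
    return model


def train_add_loss(number_of_samples=16, epochs=2):
    rng = np.random.default_rng(0)
    images_a = rng.random((number_of_samples, 8, 8, 1)).astype("float32")
    images_b = rng.random((number_of_samples, 8, 8, 1)).astype("float32")
    model = build_add_loss_model()
    # No targets are passed, which corresponds to fitting with y=None.
    history = model.fit(
        [images_a, images_b], epochs=epochs, batch_size=8, verbose=0
    )
    return model, history
```

Unlike the Lambda approach, the model here still outputs an embedding, so model.predict can be used directly at inference time.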
