What is the purpose of the add_loss function in Keras?

Question

I recently stumbled across variational autoencoders and tried to get them working on MNIST using Keras, following a tutorial I found on GitHub.

My question concerns the following lines of code:

# Imports assumed from the tutorial's context
from keras.models import Model
from keras import metrics
from keras import backend as K

# Build model
vae = Model(x, x_decoded_mean)

# Calculate custom loss
xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
vae_loss = K.mean(xent_loss + kl_loss)

# Compile
vae.add_loss(vae_loss)
vae.compile(optimizer='rmsprop')

Why is add_loss used instead of specifying the loss as a compile option? Something like vae.compile(optimizer='rmsprop', loss=vae_loss) does not seem to work and throws the following error:

ValueError: The model cannot be compiled because it has no loss to optimize.

What is the difference between this function and a custom loss function that I can pass as an argument to Model.compile()?

Thanks in advance!

P.S.: I know there are several issues concerning this on GitHub, but most of them were open and uncommented. If this has been resolved already, please share the link!

Edit 1

I removed the line that adds the loss to the model and used the loss argument of the compile function instead. It looks like this now:

# Build model
vae = Model(x, x_decoded_mean)

# Calculate custom loss
xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
vae_loss = K.mean(xent_loss + kl_loss)

# Compile
vae.compile(optimizer='rmsprop', loss=vae_loss)

This throws a TypeError:

TypeError: Using a 'tf.Tensor' as a Python 'bool' is not allowed. Use 'if t is not None:' instead of 'if t:' to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.


Edit 2

Thanks to @MarioZ's efforts, I was able to figure out a workaround for this.

# Build model
vae = Model(x, x_decoded_mean)

# Calculate custom loss in separate function
def vae_loss(x, x_decoded_mean):
    xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
    kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    vae_loss = K.mean(xent_loss + kl_loss)
    return vae_loss

# Compile
vae.compile(optimizer='rmsprop', loss=vae_loss)

...

vae.fit(x_train, 
    x_train,        # <-- did not need this previously
    shuffle=True,
    epochs=epochs,
    batch_size=batch_size,
    validation_data=(x_test, x_test))     # <-- worked with (x_test, None) before

For some strange reason, I had to explicitly specify y and y_test while fitting the model; originally, I didn't need to do this. The produced samples seem reasonable to me.
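For comparison, the earlier add_loss version did not need targets at all, since the loss is attached to the model itself; based on the comments in the snippet above, its fit call looked roughly like this:

# With vae.add_loss(vae_loss) there is no external loss function, so no
# y targets are passed -- Keras optimizes the loss attached to the graph.
vae.fit(x_train,
        shuffle=True,
        epochs=epochs,
        batch_size=batch_size,
        validation_data=(x_test, None))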

Although I could resolve this, I still don't know what the differences and disadvantages of these two methods are (other than requiring different syntax). Can someone give me more insight?

Answer

I'll try to answer the original question of why model.add_loss() is being used instead of specifying a custom loss function to model.compile(loss=...).

All loss functions in Keras always take two parameters, y_true and y_pred. Have a look at the definitions of the various standard loss functions available in Keras: they all have these two parameters. They are the 'targets' (the Y variable in many textbooks) and the actual output of the model. Most standard loss functions can be written as an expression of these two tensors, but some more complex losses cannot. This is the case for your VAE example, because the loss function also depends on additional tensors, namely z_log_var and z_mean, which are not available to the loss function. Using model.add_loss() has no such restriction: it allows you to write much more complex losses that depend on many other tensors, at the cost of being more tightly coupled to the model, whereas the standard loss functions work with just any model.
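To make the contrast concrete, here is a minimal sketch of both styles, reusing the tensors from the question's snippet (plain_mse is just an illustrative stand-in for a standard two-argument loss, not part of the tutorial):

from keras import backend as K

# Style 1: a compile-time loss only ever sees targets and predictions.
# Keras invokes it as loss_fn(y_true, y_pred) -- nothing else is passed in.
def plain_mse(y_true, y_pred):
    return K.mean(K.square(y_true - y_pred), axis=-1)

model.compile(optimizer='rmsprop', loss=plain_mse)

# Style 2: add_loss() accepts an arbitrary scalar tensor, so the loss can
# freely mix in z_mean and z_log_var, which never appear as y_true/y_pred.
kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
vae.add_loss(K.mean(xent_loss + kl_loss))
vae.compile(optimizer='rmsprop')  # no loss= argument needed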

(Note: the code proposed in other answers here is somewhat cheating, inasmuch as it just uses global variables to sneak in the additional required dependencies. This makes the loss function not a true function in the mathematical sense. I consider this much less clean code, and I expect it to be more error-prone.)
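As a purely hypothetical illustration of that cheat (z_extra is an invented name, not from the tutorial), such a loss formally takes only (y_true, y_pred) but silently reaches out to a tensor defined elsewhere:

# z_extra is assumed to be some tensor defined in the surrounding
# model-building code. The loss depends on it through Python scoping,
# so it is not a function of y_true and y_pred alone.
def sneaky_loss(y_true, y_pred):
    recon = K.mean(K.square(y_true - y_pred), axis=-1)
    return recon + K.mean(z_extra)  # hidden out-of-scope dependency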
