What is the purpose of the add_loss function in Keras?


Problem description


Currently I stumbled across variational autoencoders and tried to make them work on MNIST using keras. I found a tutorial on github.

My question concerns the following lines of code:

# Build model
vae = Model(x, x_decoded_mean)

# Calculate custom loss
xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
vae_loss = K.mean(xent_loss + kl_loss)

# Compile
vae.add_loss(vae_loss)
vae.compile(optimizer='rmsprop')

Why is add_loss used instead of specifying the loss as a compile option? Something like vae.compile(optimizer='rmsprop', loss=vae_loss) does not seem to work and throws the following error:

ValueError: The model cannot be compiled because it has no loss to optimize.

What is the difference between this function and a custom loss function that I can pass as the loss argument to Model.compile()?

Thanks in advance!

P.S.: I know there are several issues concerning this on github, but most of them were open and uncommented. If this has been resolved already, please share the link!


Edit 1

I removed the line which adds the loss to the model and used the loss argument of the compile function. It looks like this now:

# Build model
vae = Model(x, x_decoded_mean)

# Calculate custom loss
xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
vae_loss = K.mean(xent_loss + kl_loss)

# Compile
vae.compile(optimizer='rmsprop', loss=vae_loss)

This throws a TypeError:

TypeError: Using a 'tf.Tensor' as a Python 'bool' is not allowed. Use 'if t is not None:' instead of 'if t:' to test if a tensor is defined, and use TensorFlow ops such as tf.cond to execute subgraphs conditioned on the value of a tensor.


Edit 2

Thanks to @MarioZ's efforts, I was able to figure out a workaround for this.

# Build model
vae = Model(x, x_decoded_mean)

# Calculate custom loss in separate function
def vae_loss(x, x_decoded_mean):
    xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
    kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    vae_loss = K.mean(xent_loss + kl_loss)
    return vae_loss

# Compile
vae.compile(optimizer='rmsprop', loss=vae_loss)

...

vae.fit(x_train, 
    x_train,        # <-- did not need this previously
    shuffle=True,
    epochs=epochs,
    batch_size=batch_size,
    validation_data=(x_test, x_test))     # <-- worked with (x_test, None) before

For some strange reason, I had to explicitly specify y and y_test while fitting the model. Originally, I didn't need to do this. The produced samples seem reasonable to me.

Although I could resolve this, I still don't know what the differences and disadvantages of these two methods are (other than needing a different syntax). Can someone give me more insight?

Solution

I'll try to answer the original question of why model.add_loss() is being used instead of specifying a custom loss function to model.compile(loss=...).

All loss functions in Keras take exactly two parameters, y_true and y_pred. Have a look at the definitions of the various standard loss functions available in Keras; they all have these two parameters. They are the 'targets' (the Y variable in many textbooks) and the actual output of the model. Most standard loss functions can be written as an expression of these two tensors, but some more complex losses cannot. That is the case for your VAE example, because the loss also depends on additional tensors, namely z_log_var and z_mean, which are not available to a loss function passed to compile(). Using model.add_loss() has no such restriction and lets you write much more complex losses that depend on many other tensors, at the cost of being more tightly coupled to the model, whereas a standard loss function works with any model.
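To make the contrast concrete, here is a minimal NumPy sketch (not the Keras API, just the arithmetic) of the loss from the question. The toy batch values are made up for illustration. Note that the KL term needs z_mean and z_log_var, which a standard loss(y_true, y_pred) signature has no way to receive:

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample binary cross-entropy, averaged over features."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred), axis=-1)

def vae_loss(y_true, y_pred, z_mean, z_log_var, original_dim):
    """Mirrors the question's loss: reconstruction term + KL term.
    The two extra tensor arguments are exactly what loss=... cannot supply."""
    xent_loss = original_dim * binary_crossentropy(y_true, y_pred)
    kl_loss = -0.5 * np.sum(1.0 + z_log_var - np.square(z_mean) - np.exp(z_log_var), axis=-1)
    return float(np.mean(xent_loss + kl_loss))

# Toy batch: 2 samples, 4 features, 2 latent dimensions.
x = np.array([[0.0, 1.0, 1.0, 0.0], [1.0, 0.0, 0.0, 1.0]])
x_hat = np.full((2, 4), 0.5)
z_mean = np.zeros((2, 2))      # a standard-normal posterior ...
z_log_var = np.zeros((2, 2))   # ... makes the KL term exactly 0
print(vae_loss(x, x_hat, z_mean, z_log_var, original_dim=4))
```

With z_mean = 0 and z_log_var = 0 the KL term vanishes, so the printed value is just the reconstruction term, 4·log 2.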

(Note: The code proposed in other answers here is somewhat cheating, inasmuch as it just uses global variables to sneak in the additional required dependencies. This makes the loss function not a true function in the mathematical sense. I consider this much less clean code and expect it to be more error-prone.)
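For illustration, here is a tiny sketch (plain Python/NumPy, with hypothetical names) of the pattern the note criticizes: a function with the standard (y_true, y_pred) loss signature that smuggles an extra dependency in from the enclosing scope, so the same arguments can yield different results:

```python
import numpy as np

extra = {"kl_weight": 0.0}  # module-level state the "loss" secretly reads

def sneaky_loss(y_true, y_pred):
    # The signature looks like a standard Keras loss, but the result also
    # depends on `extra`, which is not an argument.
    return float(np.mean((y_true - y_pred) ** 2) + extra["kl_weight"])

y_true, y_pred = np.zeros(3), np.ones(3)
a = sneaky_loss(y_true, y_pred)   # 1.0
extra["kl_weight"] = 1.0
b = sneaky_loss(y_true, y_pred)   # 2.0 -- same inputs, different output
```

This is why such a "loss" is not a function of its arguments in the mathematical sense: its value changes without either argument changing.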
