Variational auto-encoder: implementing warm-up in Keras
Question
I recently read this paper, which introduces a process called "Warm-Up" (WU): it consists of multiplying the KL-divergence term in the loss by a variable whose value depends on the epoch number (it evolves linearly from 0 to 1).
I was wondering if this is the right way to do that:
# imports used below
from keras import backend as K
from keras import objectives
import tensorflow as tf

beta = K.variable(value=0.0)

def vae_loss(x, x_decoded_mean):
    # cross entropy
    xent_loss = K.mean(objectives.categorical_crossentropy(x, x_decoded_mean))
    # kl divergence
    for k in range(n_sample):
        epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0.,
                                  std=1.0)  # used for every z_i sampling
        # Sample several layers of latent variables
        for mean, var in zip(means, variances):
            z_ = mean + K.exp(K.log(var) / 2) * epsilon
            # build z
            try:
                z = tf.concat([z, z_], -1)
            except NameError:
                z = z_
            except TypeError:
                z = z_
            # sum loss (using a MC approximation)
            try:
                loss += K.sum(log_normal2(z_, mean, K.log(var)), -1)
            except NameError:
                loss = K.sum(log_normal2(z_, mean, K.log(var)), -1)
        print("z", z)
        loss -= K.sum(log_stdnormal(z), -1)
        z = None
    kl_loss = loss / n_sample
    print('kl loss:', kl_loss)
    # result
    result = beta * kl_loss + xent_loss
    return result

# define callback to change the value of beta at each epoch
def warmup(epoch):
    value = (epoch / 10.0) * (epoch <= 10.0) + 1.0 * (epoch > 10.0)
    print("beta:", value)
    beta = K.variable(value=value)

from keras.callbacks import LambdaCallback
wu_cb = LambdaCallback(on_epoch_end=lambda epoch, log: warmup(epoch))

# train model
vae.fit(
    padded_X_train[:last_train, :, :],
    padded_X_train[:last_train, :, :],
    batch_size=batch_size,
    nb_epoch=nb_epoch,
    verbose=0,
    callbacks=[tb, wu_cb],
    validation_data=(padded_X_test[:last_test, :, :], padded_X_test[:last_test, :, :])
)
Answer
This will not work. I tested it to figure out exactly why it was not working. The key thing to remember is that Keras creates a static graph once at the beginning of training.
Therefore, the vae_loss function is called only once to create the loss tensor, which means that the reference to the beta variable will remain the same every time the loss is calculated. However, your warmup function reassigns beta to a new K.variable. Thus, the beta that is used for calculating the loss is a different beta from the one that gets updated, and its value will always be 0.
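The rebinding problem can be reproduced without Keras. The sketch below is a plain-Python analogy (the Variable class and make_loss are hypothetical stand-ins for K.variable and the graph construction, not Keras API): the loss function captures one specific object at build time, so rebinding the name later has no effect on it.

```python
class Variable:
    """Minimal stand-in for K.variable: a mutable box holding a float."""
    def __init__(self, value):
        self.value = value

def make_loss(beta_ref):
    """Runs once, like graph construction: it captures this exact object."""
    def loss(kl, xent):
        return beta_ref.value * kl + xent
    return loss

beta = Variable(0.0)
loss = make_loss(beta)   # the "graph" is built here, once

beta = Variable(1.0)     # the buggy callback: rebinds the NAME to a new object
print(loss(2.0, 3.0))    # -> 3.0; the captured object still holds 0.0
```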
It is an easy fix. Just change this line in your warmup callback:
beta = K.variable(value=value)
to:
K.set_value(beta, value)
This way the actual value in beta gets updated "in place" rather than a new variable being created, and the loss will be properly re-calculated.