Variational auto-encoder: implementing warm-up in Keras


Problem description

I recently read this paper, which introduces a process called "Warm-Up" (WU): the KL-divergence term of the loss is multiplied by a variable whose value depends on the epoch number (it evolves linearly from 0 to 1).

I was wondering whether this is a good way to do it:

# imports needed by this snippet (the rest of the script is omitted)
import tensorflow as tf
from keras import backend as K
from keras import objectives

# warm-up coefficient that scales the KL term of the loss
beta = K.variable(value=0.0)

def vae_loss(x, x_decoded_mean):
    # cross entropy
    xent_loss = K.mean(objectives.categorical_crossentropy(x, x_decoded_mean))

    # kl divergence
    for k in range(n_sample):
        epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0.,
                                  std=1.0)  # used for every z_i sampling
        # Sample several layers of latent variables
        for mean, var in zip(means, variances):
            z_ = mean + K.exp(K.log(var) / 2) * epsilon

            # build z
            try:
                z = tf.concat([z, z_], -1)
            except NameError:
                z = z_
            except TypeError:
                z = z_

            # sum loss (using a MC approximation)
            # (log_normal2 / log_stdnormal are log-density helpers defined elsewhere)
            try:
                loss += K.sum(log_normal2(z_, mean, K.log(var)), -1)
            except NameError:
                loss = K.sum(log_normal2(z_, mean, K.log(var)), -1)
        print("z", z)
        loss -= K.sum(log_stdnormal(z), -1)
        z = None
    kl_loss = loss / n_sample
    print('kl loss:', kl_loss)

    # result
    result = beta*kl_loss + xent_loss
    return result

# define callback to change the value of beta at each epoch
def warmup(epoch):
    value = (epoch/10.0) * (epoch <= 10.0) + 1.0 * (epoch > 10.0)
    print("beta:", value)
    beta = K.variable(value=value)

from keras.callbacks import LambdaCallback
wu_cb = LambdaCallback(on_epoch_end=lambda epoch, log: warmup(epoch))


# train model
vae.fit(
    padded_X_train[:last_train,:,:],
    padded_X_train[:last_train,:,:],
    batch_size=batch_size,
    nb_epoch=nb_epoch,
    verbose=0,
    callbacks=[tb, wu_cb],
    validation_data=(padded_X_test[:last_test,:,:], padded_X_test[:last_test,:,:])
)

Recommended answer

This will not work. I tested it to figure out exactly why it was not working. The key thing to remember is that Keras creates a static graph once at the beginning of training.

Therefore, the vae_loss function is called only once to create the loss tensor, which means that the reference to the beta variable will remain the same every time the loss is calculated. However, your warmup function reassigns beta to a new K.variable. Thus, the beta that is used for calculating loss is a different beta than the one that gets updated, and the value will always be 0.
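
To make the rebinding issue concrete, here is a minimal, hypothetical sketch (assuming the TensorFlow backend); loss_tensor stands in for the loss node that Keras builds once:

import keras.backend as K

beta = K.variable(0.0)
loss_tensor = beta * 2.0      # the graph keeps a reference to this specific variable

beta = K.variable(1.0)        # rebinds the Python name only; loss_tensor is unchanged
print(K.eval(loss_tensor))    # still prints 0.0 -- the new variable is never used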

It is an easy fix. Just change this line in your warmup callback:

beta = K.variable(value=value)

to:

K.set_value(beta, value)

This way the actual value in beta gets updated "in place" rather than creating a new variable, and the loss will be properly re-calculated.
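
Applied to the code from the question, the warmup callback would then look something like this (a minimal sketch, keeping the original 10-epoch ramp):

beta = K.variable(value=0.0)  # the same variable that vae_loss closes over

def warmup(epoch):
    # linear ramp from 0 to 1 over the first 10 epochs, then constant at 1
    value = (epoch / 10.0) * (epoch <= 10.0) + 1.0 * (epoch > 10.0)
    print("beta:", value)
    K.set_value(beta, value)  # update the existing variable in place

from keras.callbacks import LambdaCallback
wu_cb = LambdaCallback(on_epoch_end=lambda epoch, log: warmup(epoch))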
