Variational auto-encoder: implementing warm-up in Keras

Question

I recently read this paper, which introduces a process called "Warm-Up" (WU): the KL-divergence term of the loss is multiplied by a variable whose value depends on the epoch number (it evolves linearly from 0 to 1).
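
In other words, the KL term gets a weight that ramps up linearly over the first epochs and then saturates at 1. A minimal sketch of that schedule (the 10-epoch warm-up length is an assumption here, chosen to match the code below):

# Hypothetical linear KL warm-up schedule
def kl_weight(epoch, n_warmup=10):
    """Anneal the KL weight linearly from 0 to 1, then keep it at 1."""
    return min(epoch / float(n_warmup), 1.0)

# loss at a given epoch ~ reconstruction_loss + kl_weight(epoch) * kl_loss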

I was wondering if this is a good way to do that:

# Keras 1.x-style imports assumed by this snippet; log_normal2 / log_stdnormal
# are the question's own log-density helpers, defined elsewhere
from keras import backend as K
from keras import objectives
import tensorflow as tf

beta = K.variable(value=0.0)

def vae_loss(x, x_decoded_mean):
    # cross entropy
    xent_loss = K.mean(objectives.categorical_crossentropy(x, x_decoded_mean))

    # kl divergence
    for k in range(n_sample):
        epsilon = K.random_normal(shape=(batch_size, latent_dim), mean=0.,
                              std=1.0)  # used for every z_i sampling
        # Sample several layers of latent variables
        for mean, var in zip(means, variances):
            z_ = mean + K.exp(K.log(var) / 2) * epsilon

            # build z by concatenating across layers (fall back when z is not defined yet)
            try:
                z = tf.concat([z, z_], -1)
            except (NameError, TypeError):
                z = z_

            # sum loss (using a MC approximation)
            try:
                loss += K.sum(log_normal2(z_, mean, K.log(var)), -1)
            except NameError:
                loss = K.sum(log_normal2(z_, mean, K.log(var)), -1)
        print("z", z)
        loss -= K.sum(log_stdnormal(z) , -1)
        z = None
    kl_loss = loss / n_sample
    print('kl loss:', kl_loss)

    # result
    result = beta*kl_loss + xent_loss
    return result

# define callback to change the value of beta at each epoch
def warmup(epoch):
    value = (epoch/10.0) * (epoch <= 10.0) + 1.0 * (epoch > 10.0)
    print("beta:", value)
    beta = K.variable(value=value)

from keras.callbacks import LambdaCallback
wu_cb = LambdaCallback(on_epoch_end=lambda epoch, log: warmup(epoch))


# train model
vae.fit(
    padded_X_train[:last_train,:,:],
    padded_X_train[:last_train,:,:],
    batch_size=batch_size,
    nb_epoch=nb_epoch,
    verbose=0,
    callbacks=[tb, wu_cb],
    validation_data=(padded_X_test[:last_test,:,:], padded_X_test[:last_test,:,:])
)

Answer

This will not work. I tested it to figure out exactly why it was not working. The key thing to remember is that Keras creates a static graph once at the beginning of training.

Therefore, the vae_loss function is called only once to create the loss tensor, which means that the reference to the beta variable will remain the same every time the loss is calculated. However, your warmup function reassigns beta to a new K.variable. Thus, the beta that is used for calculating loss is a different beta than the one that gets updated, and the value will always be 0.
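
A small illustration of why the rebinding has no effect (a sketch using only the Keras backend API; the variable names are made up):

from keras import backend as K

beta = K.variable(0.0)
scaled = beta * 3.0      # this tensor keeps a reference to the original variable

beta = K.variable(1.0)   # only rebinds the Python name; the graph is unchanged
print(K.eval(scaled))    # still 0.0 -- the tensor reads the original variable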

It is an easy fix. Just change this line in your warmup callback:

beta = K.variable(value=value)

to:

K.set_value(beta, value)

This way the actual value in beta gets updated "in place" rather than creating a new variable, and the loss will be properly re-calculated.
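
Putting it together, a minimal sketch of the corrected callback (keeping the question's 10-epoch ramp and Keras 1.x-style API; only the K.set_value line differs from the original):

from keras import backend as K
from keras.callbacks import LambdaCallback

beta = K.variable(value=0.0)  # must be the same variable closed over by vae_loss

def warmup(epoch):
    value = min(epoch / 10.0, 1.0)   # linear ramp over the first 10 epochs
    print("beta:", value)
    K.set_value(beta, value)         # update in place instead of rebinding the name

wu_cb = LambdaCallback(on_epoch_end=lambda epoch, log: warmup(epoch))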
