How to apply Monte Carlo Dropout, in tensorflow, for an LSTM if batch normalization is part of the model?


Problem description

I have a model composed of 3 LSTM layers followed by a batch norm layer and, finally, a dense layer. Here is the code:

# Assumed imports for this snippet
import tensorflow as tf
from tensorflow.keras import layers

# return_RNN and RNN_type are defined elsewhere; return_RNN maps the string in
# hparams["rnn_type"] to the corresponding Keras recurrent layer class.
def build_uncomplied_model(hparams):
    inputs = tf.keras.Input(shape=(None, hparams["n_features"]))
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_1"], return_sequences=True, recurrent_dropout=hparams['dropout'])(inputs)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_2"], return_sequences=True)(x)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_3"], return_sequences=True)(x)
    x = layers.BatchNormalization()(x)
    outputs = layers.TimeDistributed(layers.Dense(hparams["n_features"]))(x)

    model = tf.keras.Model(inputs, outputs, name=RNN_type + "_model")
    return model

Now I am aware that to apply MCDropout, we can apply the following code:

# 100 stochastic forward passes with dropout kept active, averaged afterwards
y_predict = np.stack([my_model(X_test, training=True) for _ in range(100)])
y_proba = y_predict.mean(axis=0)

However, setting training=True also puts the batch norm layer in training mode, so it normalizes with the statistics of the test batch instead of its learned moving averages, which distorts the predictions on the test dataset.

Additionally, building a custom Dropout layer that forces training=True isn't a solution in my case, because I am using the LSTM's own (recurrent) dropout rather than separate Dropout layers:

class MCDropout(tf.keras.layers.Dropout):
    # Dropout layer that is always active, even at inference time
    def call(self, inputs):
        return super().call(inputs, training=True)

Any help is much appreciated!!

Solution

A possible solution is to create a custom LSTM layer that overrides the call method to force the training flag to True:

from tensorflow import keras

class MCLSTM(keras.layers.LSTM):
    def __init__(self, units, **kwargs):
        super(MCLSTM, self).__init__(units, **kwargs)

    def call(self, inputs, mask=None, training=None, initial_state=None):
        # Ignore the training flag that is passed in and always run in training
        # mode, so the (recurrent) dropout stays active at inference time.
        return super(MCLSTM, self).call(
            inputs,
            mask=mask,
            training=True,
            initial_state=initial_state,
        )

Then you can use it in your code

def build_uncomplied_model(hparams):
    inputs = tf.keras.Input(shape=(None, hparams["n_features"]))
    x = MCLSTM(hparams["cell_size_1"], return_sequences=True, recurrent_dropout=hparams['dropout'])(inputs)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_2"], return_sequences=True)(x)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_3"], return_sequences=True)(x)
    x = layers.BatchNormalization()(x)
    outputs = layers.TimeDistributed(layers.Dense(hparams["n_features"]))(x)

    model = tf.keras.Model(inputs, outputs, name=RNN_type + "_model")
    return model

or add it to your return_RNN factory (a more elegant way); a sketch of that is given below.
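For example, assuming return_RNN simply maps the rnn_type string to a Keras recurrent layer class, a minimal sketch of such a factory that hands back the Monte Carlo variant for LSTMs could look like this (the type strings and the non-LSTM branches are assumptions, not part of the original code):

from tensorflow import keras

def return_RNN(rnn_type):
    # Hypothetical factory: always-in-training-mode LSTM for "lstm",
    # stock Keras classes otherwise.
    if rnn_type == "lstm":
        return MCLSTM
    elif rnn_type == "gru":
        return keras.layers.GRU
    return keras.layers.SimpleRNN

This way every LSTM built through the factory keeps its dropout active during Monte Carlo sampling, while the BatchNormalization layer is left untouched.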

===== EDIT =====

Another solution could be to add the training flag when creating the model. Something like this:

def build_uncomplied_model(hparams):
    inputs = tf.keras.Input(shape=(None, hparams["n_features"]))
    # This is the Monte Carlo LSTM: training=True at call time keeps its dropout active
    x = LSTM(hparams["cell_size_1"], return_sequences=True, recurrent_dropout=hparams['dropout'])(inputs, training=True)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_2"], return_sequences=True)(x)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_3"], return_sequences=True)(x)
    x = layers.BatchNormalization()(x)
    outputs = layers.TimeDistributed(layers.Dense(hparams["n_features"]))(x)

    model = tf.keras.Model(inputs, outputs, name=RNN_type + "_model")
    return model
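
With either variant, the forced training=True affects only the LSTM dropout, so at prediction time the model can be called normally: batch normalization then keeps using its learned moving averages while the Monte Carlo LSTM's dropout stays active. A minimal sketch of the MC prediction loop under that assumption (model is the trained model returned by build_uncomplied_model and X_test is your test input):

import numpy as np

# 100 stochastic forward passes; dropout in the MC LSTM is always on,
# while batch norm runs in inference mode because training is not forced here.
y_samples = np.stack([model(X_test) for _ in range(100)])
y_mean = y_samples.mean(axis=0)  # Monte Carlo estimate of the prediction
y_std = y_samples.std(axis=0)    # per-output uncertainty estimate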
