How to apply Monte Carlo Dropout, in TensorFlow, for an LSTM if batch normalization is part of the model?
Problem description
I have a model composed of 3 LSTM layers followed by a batch norm layer and finally a dense layer. Here is the code:
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def build_uncomplied_model(hparams):
    inputs = tf.keras.Input(shape=(None, hparams["n_features"]))
    # return_RNN is a factory that maps a string such as "LSTM" to the
    # corresponding layer class (e.g. layers.LSTM); RNN_type is a global string
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_1"], return_sequences=True, recurrent_dropout=hparams['dropout'])(inputs)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_2"], return_sequences=True)(x)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_3"], return_sequences=True)(x)
    x = layers.BatchNormalization()(x)
    outputs = layers.TimeDistributed(layers.Dense(hparams["n_features"]))(x)
    model = tf.keras.Model(inputs, outputs, name=RNN_type + "_model")
    return model
Now I am aware that to apply MC Dropout, we can run the following code:
# Draw 100 stochastic forward passes and average them
y_predict = np.stack([my_model(X_test, training=True) for _ in range(100)])
y_proba = y_predict.mean(axis=0)
However, setting training=True also puts the batch norm layer in training mode, so it normalizes with the statistics of each test batch (and updates its moving averages) instead of the statistics learned during training, effectively fitting it to the testing dataset.
Additionally, building a custom Dropout layer that sets training to True isn't a solution in my case, because I am using recurrent_dropout inside the LSTM rather than standalone Dropout layers:
class MCDropout(tf.keras.layers.Dropout):
    def call(self, inputs):
        # Always run dropout, even at inference time
        return super().call(inputs, training=True)
Any help is much appreciated!!
Solution
A possible solution could be to create a custom LSTM layer. You should override the call method to force the training flag to be True:
from tensorflow import keras

class MCLSTM(keras.layers.LSTM):
    def __init__(self, units, **kwargs):
        super(MCLSTM, self).__init__(units, **kwargs)

    def call(self, inputs, mask=None, training=None, initial_state=None):
        # Ignore the incoming training flag and keep dropout active
        return super(MCLSTM, self).call(
            inputs,
            mask=mask,
            training=True,
            initial_state=initial_state,
        )
Then you can use it in your code:
def build_uncomplied_model(hparams):
    inputs = tf.keras.Input(shape=(None, hparams["n_features"]))
    # The first LSTM is the Monte Carlo one: its dropout is always active
    x = MCLSTM(hparams["cell_size_1"], return_sequences=True, recurrent_dropout=hparams['dropout'])(inputs)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_2"], return_sequences=True)(x)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_3"], return_sequences=True)(x)
    x = layers.BatchNormalization()(x)
    outputs = layers.TimeDistributed(layers.Dense(hparams["n_features"]))(x)
    model = tf.keras.Model(inputs, outputs, name=RNN_type + "_model")
    return model
or add it to your return_RNN factory (a more elegant way), as sketched below.
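For illustration, here is a minimal sketch of such a factory, assuming the original return_RNN simply maps a type string to a layer class (the exact strings, and the GRU branch, are assumptions):

def return_RNN(rnn_type):
    # Hypothetical factory: map the rnn_type string to a layer class.
    # Returning MCLSTM means every LSTM built through the factory keeps
    # its dropout active at inference time.
    if rnn_type == "LSTM":
        return MCLSTM
    if rnn_type == "GRU":
        return tf.keras.layers.GRU
    raise ValueError(f"Unknown rnn_type: {rnn_type}")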
===== EDIT =====
Another solution could be to add the training flag when creating the model. Something like this:
def build_uncomplied_model(hparams):
    inputs = tf.keras.Input(shape=(None, hparams["n_features"]))
    # This is the Monte Carlo LSTM: passing training=True when calling the
    # layer at construction time bakes the flag in permanently
    x = layers.LSTM(hparams["cell_size_1"], return_sequences=True, recurrent_dropout=hparams['dropout'])(inputs, training=True)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_2"], return_sequences=True)(x)
    x = return_RNN(hparams["rnn_type"])(hparams["cell_size_3"], return_sequences=True)(x)
    x = layers.BatchNormalization()(x)
    outputs = layers.TimeDistributed(layers.Dense(hparams["n_features"]))(x)
    model = tf.keras.Model(inputs, outputs, name=RNN_type + "_model")
    return model
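Because the training flag is baked in at construction time, it takes precedence at inference, so the sampling loop from the question no longer needs training=True and the batch norm layer keeps using its learned moving statistics. A small usage sketch under that assumption (X_test and the sample count of 100 are taken from the question):

model = build_uncomplied_model(hparams)
# ... train the model as usual ...

# Dropout in the first LSTM still fires on every pass, while batch norm
# runs in inference mode because training is not forced to True here
y_samples = np.stack([model(X_test, training=False) for _ in range(100)])
y_mean = y_samples.mean(axis=0)  # Monte Carlo prediction
y_std = y_samples.std(axis=0)    # per-output uncertainty estimate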