Masking zero inputs in LSTM in Keras without using embedding

Question

I am training an LSTM in Keras:

from keras.models import Sequential
from keras.layers import Bidirectional, LSTM, TimeDistributed, Dense

iclf = Sequential()
iclf.add(Bidirectional(LSTM(units=10, return_sequences=True, recurrent_dropout=0.3), input_shape=(None, 2048)))
iclf.add(TimeDistributed(Dense(1, activation='sigmoid')))

The input to each cell is a 2048-dimensional vector which is known and does not need to be learned (if you will, they are the ELMo embeddings of the words in the input sentences). Therefore, there is no Embedding layer here.

Since the input sequences have variable lengths, they are padded using pad_sequences:

X = pad_sequences(sequences=X, padding='post', truncating='post', value=0.0, dtype='float32')
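
For illustration, here is a minimal sketch of what this padding produces; the batch size, sentence lengths, and random vectors below are made up for the example, not taken from the question:

import numpy as np
from keras.preprocessing.sequence import pad_sequences

# hypothetical batch: one (timesteps, 2048) ELMo array per sentence
X = [np.random.rand(3, 2048).astype('float32'),   # sentence with 3 words
     np.random.rand(5, 2048).astype('float32')]   # sentence with 5 words

X = pad_sequences(sequences=X, padding='post', truncating='post', value=0.0, dtype='float32')

print(X.shape)         # (2, 5, 2048): the shorter sentence is padded with all-zero rows
print(X[0, 3:].sum())  # 0.0: timesteps 3 and 4 of the first sentence are pure padding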

Now, I want to tell the LSTM to ignore these padded elements. The official way is to use the Embedding layer with mask_zero=True. But here there is no Embedding layer. How can I inform the LSTM to mask the zero elements?

Answer

As @Today has suggested in the comments, you can use the Masking layer. Here I have added a toy problem.

# lstm autoencoder recreate sequence
from keras.models import Sequential
from keras.layers import LSTM, Masking, Dense, RepeatVector, TimeDistributed
from keras.utils import plot_model
from keras.preprocessing.sequence import pad_sequences


# define input sequence
sequence = [[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
            [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
            [0.3, 0.4, 0.5, 0.6]]
# make sure to use dtype='float32' in padding: the default dtype is int32,
# which would truncate the floating-point values to zero
sequence = pad_sequences(sequence, padding='post', dtype='float32')


# reshape input into [samples, timesteps, features]
n_obs = len(sequence)
n_in = 9
sequence = sequence.reshape((n_obs, n_in, 1))

# define model
model = Sequential()
model.add(Masking(mask_value=0, input_shape=(n_in, 1)))
model.add(LSTM(100, activation='relu'))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)
plot_model(model, show_shapes=True, to_file='reconstruct_lstm_autoencoder.png')
# demonstrate recreation
yhat = model.predict(sequence, verbose=0)
print(yhat[0,:,0])
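
Applied back to the model from the question, a minimal sketch would simply place a Masking layer in front of the Bidirectional LSTM; the compile settings below are illustrative assumptions, the only essential part is the Masking layer:

from keras.models import Sequential
from keras.layers import Masking, Bidirectional, LSTM, TimeDistributed, Dense

iclf = Sequential()
# timesteps whose 2048 features are all equal to mask_value (the padded positions)
# are skipped by the downstream recurrent and TimeDistributed layers
iclf.add(Masking(mask_value=0.0, input_shape=(None, 2048)))
iclf.add(Bidirectional(LSTM(units=10, return_sequences=True, recurrent_dropout=0.3)))
iclf.add(TimeDistributed(Dense(1, activation='sigmoid')))
iclf.compile(optimizer='adam', loss='binary_crossentropy')  # assumed optimizer/loss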
