Building an LSTM net with an embedding layer in Keras


Problem Description

I want to create a Keras model consisting of an embedding layer, followed by two LSTMs with dropout 0.5, and lastly a dense layer with a softmax activation.

The first LSTM should propagate the sequential output to the second layer, while in the second I am only interested in getting the hidden state of the LSTM after processing the whole sequence.

I tried the following:

from keras.layers import Input, LSTM, Dense
from keras.models import Model

sentence_indices = Input(input_shape, dtype='int32')

embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index)

embeddings = embedding_layer(sentence_indices)

# Propagate the embeddings through an LSTM layer with 128-dimensional hidden state
X = LSTM(128, return_sequences=True, dropout=0.5)(embeddings)

# Propagate X through another LSTM layer with 128-dimensional hidden state
X = LSTM(128, return_sequences=False, return_state=True, dropout=0.5)(X)

# Propagate X through a Dense layer with softmax activation to get back a batch of 5-dimensional vectors.
X = Dense(5, activation='softmax')(X)

# Create Model instance which converts sentence_indices into X.
model = Model(inputs=[sentence_indices], outputs=[X])

But I get the following error:

ValueError: Layer dense_5 expects 1 inputs, but it received 3 input tensors. Input received: [<tf.Tensor 'lstm_10/TensorArrayReadV3:0' shape=(?, 128) dtype=float32>, <tf.Tensor 'lstm_10/while/Exit_2:0' shape=(?, 128) dtype=float32>, <tf.Tensor 'lstm_10/while/Exit_3:0' shape=(?, 128) dtype=float32>]

Clearly the LSTM is not returning an output of the shape I expect. How do I fix this?

Recommended Answer

If you set return_state=True, then LSTM(...)(X) returns three things: the output, the last hidden state, and the last cell state.

So instead of X = LSTM(128, return_sequences=False, return_state=True, dropout=0.5)(X), do X, h, c = LSTM(128, return_sequences=False, return_state=True, dropout=0.5)(X).

See here for an example.
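Putting the fix together, the full model can be sketched as below. This is a minimal runnable sketch assuming tf.keras and placeholder values for the vocabulary size, embedding dimension, and sequence length (the original question uses a pretrained embedding helper, which is replaced here by a plain Embedding layer for self-containment):

```python
import numpy as np
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense
from tensorflow.keras.models import Model

# Placeholder dimensions; substitute your own vocabulary/embedding sizes.
max_len, vocab_size, emb_dim = 10, 1000, 50

sentence_indices = Input(shape=(max_len,), dtype='int32')
embeddings = Embedding(vocab_size, emb_dim)(sentence_indices)

# First LSTM returns the full sequence so the second LSTM sees every timestep.
X = LSTM(128, return_sequences=True, dropout=0.5)(embeddings)

# With return_state=True the layer returns (output, last_hidden, last_cell);
# unpack all three and pass only the output on to the Dense layer.
X, h, c = LSTM(128, return_sequences=False, return_state=True, dropout=0.5)(X)

out = Dense(5, activation='softmax')(X)
model = Model(inputs=sentence_indices, outputs=out)

preds = model.predict(np.zeros((2, max_len), dtype='int32'))
print(preds.shape)  # (2, 5)
```

Note that if you do not actually need the final hidden or cell state, simply dropping return_state=True from the second LSTM also resolves the error, since the layer then returns a single tensor.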

