What's the input of each LSTM layer in a stacked LSTM network?
Problem Description
I'm having some difficulty understanding the input-output flow of layers in stacked LSTM networks. Let's say I have created a stacked LSTM network like the one below:
# parameters
time_steps = 10
features = 2
input_shape = [time_steps, features]
batch_size = 32
# model
model = Sequential()
model.add(LSTM(64, input_shape=input_shape, return_sequences=True))
model.add(LSTM(32, input_shape=input_shape))
where our stacked LSTM network consists of 2 LSTM layers with 64 and 32 hidden units respectively. In this scenario, we expect that at each time step the 1st LSTM layer, LSTM(64), will pass as input to the 2nd LSTM layer, LSTM(32), a vector of size [batch_size, time-step, hidden_unit_length], which would represent the hidden state of the 1st LSTM layer at the current time step. What confuses me is:
- Does the 2nd LSTM layer, LSTM(32), receive as X(t) (as input) the hidden state of the 1st layer, LSTM(64), which has size [batch_size, time-step, hidden_unit_length], and pass it through its own hidden network, in this case consisting of 32 nodes?
- If the first is true, why is the input_shape of the 1st, LSTM(64), and the 2nd, LSTM(32), the same, when the 2nd only processes the hidden state of the 1st layer? Shouldn't input_shape in our case be set to [32, 10, 64]?
I found the LSTM visualization below very helpful (found here), but it doesn't expand to stacked LSTM networks:
Any help would be highly appreciated. Thanks!
Recommended Answer
The input_shape is only required for the first layer. The subsequent layers take the output of the previous layer as their input (and so their input_shape argument value is ignored).
The model below
model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(5, 2)))
model.add(LSTM(32))
represents the following architecture, which you can verify from model.summary():
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_26 (LSTM)               (None, 5, 64)             17152
_________________________________________________________________
lstm_27 (LSTM)               (None, 32)                12416
=================================================================
Replacing the line
model.add(LSTM(32))
with
model.add(LSTM(32, input_shape=(1000000, 200000)))
will still give you the same architecture (verify using model.summary()), because the input_shape is ignored: the layer takes as input the tensor output of the previous layer.
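One way to see that the 2nd layer's real input dimension is 64 (the 1st layer's output size) rather than anything declared via input_shape is to reproduce the parameter counts from model.summary() by hand. The sketch below assumes Keras's standard LSTM parameterization (4 gates, each with a kernel, a recurrent kernel, and a bias):

```python
def lstm_param_count(units, input_dim):
    # Each of the 4 gates has: a kernel (input_dim x units),
    # a recurrent kernel (units x units), and a bias (units).
    return 4 * (units * (input_dim + units) + units)

# 1st layer: input_dim = 2, the features of input_shape=(5, 2)
print(lstm_param_count(64, 2))    # 17152, matching lstm_26 above

# 2nd layer: input_dim = 64, the 1st layer's output size -- not
# anything passed via input_shape, which is why it is ignored
print(lstm_param_count(32, 64))   # 12416, matching lstm_27 above
```

If the 2nd layer really used a declared input_shape instead of the 1st layer's output, its parameter count would change, and the summary would not show 12416.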
And if you need a sequence-to-sequence architecture like the one below,
you should use the code:
model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(5, 2)))
model.add(LSTM(32, return_sequences=True))
which should return a model like:
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm_32 (LSTM)               (None, 5, 64)             17152
_________________________________________________________________
lstm_33 (LSTM)               (None, 5, 32)             12416
=================================================================
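To make the shape flow between stacked layers concrete without pulling in Keras, here is a toy pure-Python sketch. The gate math is deliberately simplified to a single tanh update (a real LSTM has 4 gates and a cell state), so only the shapes, not the values, mirror a real layer:

```python
import math

def toy_lstm_layer(inputs, units, return_sequences):
    # inputs: nested lists of shape [batch][time][features]
    # NOTE: simplified update -- this only demonstrates how shapes
    # flow between stacked layers, not real LSTM gate arithmetic.
    outputs = []
    for sample in inputs:                  # one sample: [time][features]
        h = [0.0] * units                  # hidden state: one value per unit
        seq = []
        for x_t in sample:                 # one time step: [features]
            s = sum(x_t)
            h = [math.tanh(s + h_i) for h_i in h]
            seq.append(h)
        outputs.append(seq if return_sequences else seq[-1])
    return outputs

batch_size, time_steps, features = 32, 5, 2
x = [[[0.1] * features for _ in range(time_steps)]
     for _ in range(batch_size)]

h1 = toy_lstm_layer(x, 64, return_sequences=True)   # (32, 5, 64)
h2 = toy_lstm_layer(h1, 32, return_sequences=True)  # (32, 5, 32)
```

At every time step the 2nd layer sees 64 features, the 1st layer's hidden size, regardless of any input_shape argument, which is why a shape like [32, 10, 64] never needs to be declared on the 2nd layer.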