What is the architecture behind the Keras LSTM Layer implementation?

Question

How do the input dimensions get converted to the output dimensions for the LSTM Layer in Keras? From reading Colah's blog post, it seems as though the number of "timesteps" (AKA the input_dim or the first value in the input_shape) should equal the number of neurons, which should equal the number of outputs from this LSTM layer (delineated by the units argument for the LSTM layer).

From reading this post, I understand the input shapes. What I am baffled by is how Keras plugs the inputs into each of the LSTM "smart neurons".

Keras LSTM reference

Example code that baffles me:

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(32, input_shape=(10, 64)))  # input: (timesteps=10, features=64); 32 hidden units
model.add(Dense(2))

From this, I would think that the LSTM layer has 10 neurons and each neuron is fed a vector of length 64. However, it seems it has 32 neurons and I have no idea what is being fed into each. I understand that for the LSTM to connect to the Dense layer, we can just plug all 32 outputs to each of the 2 neurons. What confuses me is the InputLayer to the LSTM.
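A quick way to see what Keras actually builds from that specification is to ask the model for the shapes it infers; a minimal sketch, assuming the standalone keras package (tf.keras behaves the same):

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(32, input_shape=(10, 64)))
model.add(Dense(2))
print(model.input_shape)   # (None, 10, 64): batch size, timesteps, features per step
print(model.output_shape)  # (None, 2): batch size, Dense units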

(There is a similar SO post, but it doesn't quite cover what I need.)

Answer

I was correct! The architecture is 10 neurons, each representing a time-step. Each neuron is fed a vector of length 64, representing the 64 features (the input_dim).
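In other words, the layer expects a 3-D batch of shape (samples, timesteps, features), and at each time-step the recurrence consumes one 64-length slice per sample; a small illustration with dummy data (numpy assumed):

import numpy as np

x = np.random.rand(4, 10, 64).astype("float32")  # 4 samples, 10 timesteps, 64 features
step_0 = x[:, 0, :]  # what the recurrence sees at the first timestep
print(step_0.shape)  # (4, 64): one 64-length feature vector per sample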

The 32 represents the number of hidden states or the "hidden unit length". It represents how many hidden states there are and also represents the output dimension (since we output the hidden state at the end of the LSTM).
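This also shows up in the layer's output shape: by default only the last hidden state is returned, while return_sequences=True returns the 32-length hidden state for every one of the 10 timesteps. A minimal sketch, again assuming the keras package:

from keras.models import Sequential
from keras.layers import LSTM

last_only = Sequential([LSTM(32, input_shape=(10, 64))])
all_steps = Sequential([LSTM(32, return_sequences=True, input_shape=(10, 64))])
print(last_only.output_shape)  # (None, 32): final hidden state only
print(all_steps.output_shape)  # (None, 10, 32): hidden state at every timestep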

Lastly, the 32-dimensional output vector from the last time-step is fed to a Dense layer of 2 neurons, which basically means plugging the 32-length vector into each of the two neurons.
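That fully connected wiring is visible in the Dense layer's weights: its kernel has one row per LSTM output and one column per Dense neuron. A small check, assuming the same keras API as above:

from keras.models import Sequential
from keras.layers import LSTM, Dense

m = Sequential([LSTM(32, input_shape=(10, 64)), Dense(2)])
kernel, bias = m.layers[-1].get_weights()
print(kernel.shape)  # (32, 2): each of the 32 LSTM outputs connects to both Dense neurons
print(bias.shape)    # (2,)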

More reading with somewhat helpful answers:
