LSTM architecture in Keras implementation?


Problem description

I am new to Keras and am going through the LSTM layer and its implementation details in the Keras documentation. It was going smoothly, but then I came across this SO post and its comments, which confused me about what the actual LSTM architecture is:

Here is the code:

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(32, input_shape=(10, 64)))
model.add(Dense(2))

As per my understanding, 10 denotes the number of time-steps, and each one of them is fed to its respective LSTM cell; 64 denotes the number of features for each time-step.
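For instance, a minimal shape check of the model above (a sketch; the batch size of 4 is arbitrary):

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(32, input_shape=(10, 64)))  # last timestep only -> (batch, 32)
model.add(Dense(2))                        # -> (batch, 2)

x = np.random.random((4, 10, 64))  # (batch, timesteps, features)
print(model.predict(x).shape)      # (4, 2)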

But the comment in the above post and the actual answer have confused me about the meaning of 32.

Also, how does the output from the LSTM get connected to the Dense layer?

A hand-drawn diagrammatic explanation would be quite helpful in visualizing the architecture.

EDIT:

As far as this other SO post is concerned, it means 32 represents the length of the output vector produced by each of the LSTM cells if return_sequences=True.

If that's true, then how do we connect each of the 32-dimensional outputs produced by each of the 10 LSTM cells to the next dense layer?

Also, kindly tell me whether the first SO post's answer is ambiguous or not.

Recommended answer

how do we connect each of the 32-dimensional outputs produced by each of the 10 LSTM cells to the next dense layer?

It depends on how you want to do it. Suppose you have:

model.add(LSTM(32, input_shape=(10, 64), return_sequences=True))

Then the output of that layer has shape (10, 32). At this point, you can either use a Flatten layer to get a single vector with 320 components, or use a TimeDistributed layer to work on each of the 10 vectors independently:

model.add(TimeDistributed(Dense(15)))

The output shape of this layer is (10, 15), and the same weights are applied to the output of every LSTM unit.
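For comparison, here is a minimal sketch of the Flatten alternative mentioned above (the trailing Dense(2) is just carried over from the question's model):

from keras.models import Sequential
from keras.layers import LSTM, Flatten, Dense

model = Sequential()
model.add(LSTM(32, input_shape=(10, 64), return_sequences=True))  # -> (10, 32)
model.add(Flatten())  # -> (320,): concatenates the ten size-32 vectors
model.add(Dense(2))   # one weight matrix over all 320 components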

it's easy to figure out the no. of LSTM cells required for the input (specified in timespan)

How to figure out the no. of LSTM units required in the output?

You either get the output of the last LSTM cell (last timestep) or the output of every LSTM cell, depending on the value of return_sequences. As for the dimensionality of the output vector, that's just a choice you have to make, just like the size of a dense layer or the number of filters in a conv layer.
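A quick sketch of the two cases (Keras reports shapes with the batch dimension as None):

from keras.models import Sequential
from keras.layers import LSTM

# Default return_sequences=False: only the last timestep's output
last_only = Sequential([LSTM(32, input_shape=(10, 64))])
print(last_only.output_shape)  # (None, 32)

# return_sequences=True: one output per timestep
every_step = Sequential([LSTM(32, input_shape=(10, 64), return_sequences=True)])
print(every_step.output_shape)  # (None, 10, 32)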

how does each of the 32-dim vectors from the 10 LSTM cells get connected to the TimeDistributed layer?

Following the previous example, you would have a (10, 32) tensor, i.e. a size-32 vector for each of the 10 LSTM cells. What TimeDistributed(Dense(15)) does is create a (15, 32) weight matrix and a bias vector of size 15, and then do:

# Pseudocode: the same dense weights are applied to every timestep
dense_outputs = []
for h_t in lstm_outputs:                                 # h_t: size-32 vector
    dense_outputs.append(
        activation(dense_weights.dot(h_t) + dense_bias)  # size-15 vector
    )

Hence, dense_outputs has size (10, 15), and the same weights were applied to every LSTM output, independently.
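One way to check that the weights really are shared across timesteps is to count parameters: a Dense(15) on 32-dim inputs has 32 * 15 + 15 = 495 parameters, independent of the 10 timesteps. A small sketch:

from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

model = Sequential()
model.add(LSTM(32, input_shape=(10, 64), return_sequences=True))
model.add(TimeDistributed(Dense(15)))

# 32 weights * 15 units + 15 biases = 495, no factor of 10 anywhere
print(model.layers[-1].count_params())  # 495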

Note that everything still works when you don't know how many timesteps you need, e.g. for machine translation. In this case, you use None for the timestep; everything I wrote still applies, with the only difference that the number of timesteps is no longer fixed. Keras will repeat the LSTM, TimeDistributed, etc. as many times as necessary (which depends on the input).
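For example, a variable-length version of the model above (a sketch):

from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

# None instead of 10: the number of timesteps comes from each input batch
model = Sequential()
model.add(LSTM(32, input_shape=(None, 64), return_sequences=True))
model.add(TimeDistributed(Dense(15)))  # output: (batch, timesteps, 15)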
