In Keras, what exactly am I configuring when I create a stateful `LSTM` layer with N `units`?


Question

The first argument in a normal Dense layer is also units, and it is the number of neurons/nodes in that layer. A standard LSTM unit, however, looks like the following:

(Diagram from "Understanding LSTM Networks")

In Keras, when I create an LSTM object like LSTM(units=N, ...), am I actually creating N of these LSTM units? Or is it the size of the "neural network" layers inside the LSTM unit, i.e., the W's in the formulas? Or is it something else?

For context, I'm working based on this example code.

Here is the documentation: https://keras.io/layers/recurrent/

It says:

units: Positive integer, dimensionality of the output space.

It makes me think it is the number of outputs from the Keras LSTM "layer" object, meaning the next layer will have N inputs. Does that mean there actually exist N of these LSTM units in the LSTM layer, or that exactly one LSTM unit is run for N iterations, outputting N of these h[t] values, from, say, h[t-N] up to h[t]?

If it only defines the number of outputs, does that mean the input can still be, say, just one, or do we have to manually create lagging input variables x[t-N] to x[t], one for each LSTM unit defined by the units=N argument?

As I'm writing this, it occurs to me what the argument return_sequences does. If set to True, all N outputs are passed forward to the next layer, while if set to False, only the last h[t] output is passed to the next layer. Am I right?

Answer

You can check this question for further information, although it is based on the Keras 1.x API.

Basically, units is the dimension of the inner cells in the LSTM. Because in an LSTM the inner cell state (C_t and C_{t-1} in the diagram), the output gate (o_t in the diagram), and the hidden/output state (h_t in the diagram) must all have the SAME dimension, your output's dimension has to be units-long as well.
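One quick way to see this is to inspect the layer's output and weight shapes. Below is a minimal sketch, assuming TensorFlow 2.x and its bundled tf.keras (the layer name and the [kernel, recurrent_kernel, bias] weight layout are the standard Keras API; the sizes are made up for illustration):

```python
import tensorflow as tf

N = 8          # units: dimension of the cell state C_t and hidden state h_t
input_dim = 3  # features per timestep; independent of N

layer = tf.keras.layers.LSTM(units=N)
y = layer(tf.zeros((2, 5, input_dim)))  # dummy batch: (batch, timesteps, features)
print(y.shape)  # (2, 8) -> the output h_t is N-dimensional

# The four gates (input, forget, cell, output) are stacked along one axis,
# so each weight matrix has 4 * N columns.
kernel, recurrent_kernel, bias = layer.get_weights()
print(kernel.shape)            # (3, 32)  = (input_dim, 4 * N)
print(recurrent_kernel.shape)  # (8, 32)  = (N, 4 * N)
print(bias.shape)              # (32,)    = (4 * N,)
```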

An LSTM in Keras defines exactly one LSTM block, whose cells are units-long. If you set return_sequences=True, it returns something with shape (batch_size, timespan, units). If False, it returns only the last output, with shape (batch_size, units).
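A short sketch of the two cases (again assuming TensorFlow 2.x / tf.keras):

```python
import tensorflow as tf

x = tf.zeros((4, 10, 3))  # (batch_size, timespan, input_dim)

# return_sequences=False (the default): only the final h_t is returned.
last_only = tf.keras.layers.LSTM(units=8)(x)
print(last_only.shape)  # (4, 8) -> (batch_size, units)

# return_sequences=True: h_t is returned for every timestep.
full_seq = tf.keras.layers.LSTM(units=8, return_sequences=True)(x)
print(full_seq.shape)   # (4, 10, 8) -> (batch_size, timespan, units)
```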

As for the input, you should provide an input for every timestep. Basically, the shape is (batch_size, timespan, input_dim), where input_dim can be different from units. If you only want to provide input at the first step, you can simply pad your data with zeros at the other timesteps.
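For example, a minimal sketch of that zero-padding idea (sizes are made up for illustration):

```python
import numpy as np

batch_size, timespan, input_dim = 4, 10, 3

# Real data only at the first timestep; zeros everywhere else.
first_step = np.random.rand(batch_size, 1, input_dim)
padding = np.zeros((batch_size, timespan - 1, input_dim))

x = np.concatenate([first_step, padding], axis=1)
print(x.shape)  # (4, 10, 3) -> (batch_size, timespan, input_dim)
```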

