When does Keras reset an LSTM state?
Question
I read all sorts of texts about it, and none seem to answer this very basic question. It's always ambiguous:
In a `stateful=False` LSTM layer, does Keras reset states after:
- every single sequence, or
- every batch?
Suppose I have X_train shaped as (1000, 20, 1), meaning 1000 sequences of 20 steps of a single value. If I make:
model.fit(X_train, y_train, batch_size=200, epochs=15)
Will it reset states for every single sequence (resets states 1000 times)?
Or will it reset states for every batch (resets states 5 times)?
Answer
Checking with some tests, I came to the following conclusion, which agrees with the documentation and with Nassim's answer:
First, there isn't a single state in the layer, but one state per sample in the batch. There are `batch_size` parallel states in such a layer.
In the `stateful=False` case, all states are reset together after every batch.
A batch with 10 sequences would create 10 states, and all 10 states are reset automatically after it's processed.
The next batch with 10 sequences will create 10 new states, which will also be reset after that batch is processed.
If all those sequences have length (timesteps) = 7, the practical result of these two batches is:
20 individual sequences, each with length 7
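This per-batch reset can be sketched without Keras at all. The toy accumulator below is not an LSTM (the real gate computations are omitted); it only mimics the bookkeeping described above: one state per sample, discarded after every batch.

```python
import numpy as np

def run_batch(batch, states=None):
    """Process one batch, carrying one state per sample.

    With states=None the batch starts from fresh zero states,
    which is exactly what stateful=False does for every batch.
    """
    batch_size, timesteps, _ = batch.shape
    if states is None:
        states = np.zeros(batch_size)  # stateful=False: fresh states each batch
    for t in range(timesteps):
        # toy update: just count how many steps each sample's state has seen
        states = states + batch[:, t, 0]
    return states

batch1 = np.ones((10, 7, 1))  # 10 sequences, 7 timesteps, 1 feature
batch2 = np.ones((10, 7, 1))

s1 = run_batch(batch1)  # each of the 10 states saw 7 steps
s2 = run_batch(batch2)  # states were reset in between: again only 7 steps
```

Each batch sees only its own 7 timesteps, so the two batches amount to 20 unrelated sequences of length 7.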
None of the sequences are related. But of course, the weights (not the states) are unique to the layer, and they represent what the layer has learned from all the sequences.
- The states are: where am I now within a sequence? Which time step is it? How has this particular sequence behaved from its beginning up to now?
- The weights are: what do I know about the general behavior of all the sequences I've seen so far?
In the `stateful=True` case, there is also the same number of parallel states, but they will simply not be reset at all.
A batch with 10 sequences will create 10 states that remain as they are at the end of the batch.
The next batch with 10 sequences (it must also be 10, since the first was 10) will reuse the same 10 states that were created before.
The practical result is: the 10 sequences in the second batch are just continuing the 10 sequences of the first batch, as if there had been no interruption at all.
If each sequence has length (timesteps) = 7, then the actual meaning is:
10 individual sequences, each with length 14
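The continuation can be sketched with the same kind of toy accumulator (again, not a real LSTM, just the state bookkeeping): the states left over from the first batch are fed into the second, so each state ends up having seen 14 steps.

```python
import numpy as np

def run_batch(batch, states):
    """Carry one per-sample state through a batch (a toy stand-in for
    an LSTM's hidden state; the real gate math is omitted)."""
    for t in range(batch.shape[1]):
        states = states + batch[:, t, 0]  # toy update: count steps seen
    return states

states = np.zeros(10)  # 10 parallel states, one per sequence
states = run_batch(np.ones((10, 7, 1)), states)  # first batch: 7 steps seen
states = run_batch(np.ones((10, 7, 1)), states)  # continuation: 14 steps total
```

Because nothing was reset between the calls, each state has now seen 14 timesteps, as if its sequence had never been interrupted.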
When you see that you have reached the total length of the sequences, you call model.reset_states(), meaning you will not continue the previous sequences anymore; from then on you start feeding new sequences.
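That reset pattern can be sketched with a toy stateful layer (the class below is hypothetical, just to show the call sequence; with a real Keras model you would call reset_states() on the model or the LSTM layer itself):

```python
import numpy as np

class ToyStatefulLayer:
    """Hypothetical stand-in for a stateful recurrent layer: it keeps
    one state per sample between calls until reset_states() is invoked."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.reset_states()

    def reset_states(self):
        self.states = np.zeros(self.batch_size)

    def __call__(self, batch):
        for t in range(batch.shape[1]):
            self.states = self.states + batch[:, t, 0]  # toy update: count steps
        return self.states

layer = ToyStatefulLayer(batch_size=10)
layer(np.ones((10, 7, 1)))            # first 7 steps of each sequence
carried = layer(np.ones((10, 7, 1)))  # continuation: 14 steps seen so far
layer.reset_states()                  # sequences finished; forget everything
fresh = layer(np.ones((10, 7, 1)))    # brand-new sequences: only 7 steps seen
```

Without the explicit reset, the third batch would have been treated as yet another continuation of the same 10 sequences.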