Stateful LSTM: When to reset states?


Problem description

Given X with dimensions (m samples, n sequences, k features) and labels y with dimensions (m samples, 0/1):

Suppose I want to train a stateful LSTM (going by the Keras definition, where stateful=True means that cell states are not reset between the sequences of each sample -- please correct me if I'm wrong!). Are states supposed to be reset on a per-epoch basis or on a per-sample basis?

Example:

for e in epoch:
    for m in X.shape[0]:          #for each sample
        for n in X.shape[1]:      #for each sequence
            #train_on_batch for model...
            #model.reset_states()  (1) I believe this is 'stateful = False'?
        #model.reset_states()      (2) wouldn't this make more sense?
    #model.reset_states()          (3) This is what I usually see...

In summary, I am not sure whether to reset states after each sequence or after each epoch (after all m samples in X have been trained).

Any advice is much appreciated.

Answer

If you use stateful=True, you would typically reset the state at the end of each epoch, or every couple of samples. If you wanted to reset the state after each sample, that would be equivalent to just using stateful=False.

Regarding the loop you provided:

for e in epoch:
    for m in X.shape[0]:          #for each sample
        for n in X.shape[1]:      #for each sequence

please note that X's dimensions are not exactly

 (m samples, n sequences, k features)

The dimensions are actually

(batch size, number of timesteps, number of features)

Hence, you are not supposed to have the inner loop:

for n in X.shape[1]
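To make the shape convention concrete, here is a small sketch (the dimensions 32, 10, and 8 are made-up values for illustration only):

```python
import numpy as np

# Hypothetical dimensions: 32 samples, 10 timesteps, 8 features each.
m, n, k = 32, 10, 8
X = np.random.rand(m, n, k)

# Keras interprets this single array as (batch size, timesteps, features):
# the LSTM layer consumes all n timesteps of a sample internally, so there
# is no need for a Python loop over X.shape[1].
print(X.shape)     # (32, 10, 8)
print(X.shape[1])  # 10 -- timesteps, handled inside the LSTM layer
```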

Now, regarding the loop

for m in X.shape[0]

since the enumeration over batches is done automatically by Keras, you don't have to implement this loop either (unless you want to reset the states every couple of samples). So if you want to reset only at the end of each epoch, you need only the outer loop.
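For the in-between option (resetting every couple of samples), the control flow could be sketched as follows. This is only a schematic sketch: the model here is a pure-Python placeholder that counts resets, while the method names train_on_batch and reset_states mirror the real Keras API; the shapes and the reset_every value are illustrative assumptions.

```python
import numpy as np

class PlaceholderModel:
    """Stand-in for a stateful Keras model, just to show the control flow."""
    def __init__(self):
        self.resets = 0
    def train_on_batch(self, x, y):
        pass  # a real model would update weights and carry cell state forward
    def reset_states(self):
        self.resets += 1  # a real model would zero the LSTM cell states

m, n, k = 6, 10, 8                # 6 samples, 10 timesteps, 8 features
X = np.random.rand(m, n, k)
y = np.random.randint(0, 2, size=(m, 1))
reset_every = 2                   # reset states every couple of samples

model = PlaceholderModel()
for e in range(3):                               # epochs
    for i in range(X.shape[0]):                  # one sample per batch
        model.train_on_batch(X[i:i+1], y[i:i+1])
        if (i + 1) % reset_every == 0:
            model.reset_states()                 # every couple of samples
    model.reset_states()                         # and at the end of each epoch
```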

Here is an example of such an architecture (taken from this blog post):

from keras.models import Sequential
from keras.layers import LSTM, Dense

# X: (samples, timesteps, features) training data; y: one-hot labels
batch_size = 1
model = Sequential()
model.add(LSTM(16, batch_input_shape=(batch_size, X.shape[1], X.shape[2]), stateful=True))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
for i in range(300):                 # each iteration is one epoch
    model.fit(X, y, epochs=1, batch_size=batch_size, verbose=2, shuffle=False)
    model.reset_states()             # reset cell states at the end of each epoch

