Train Keras Stateful LSTM return_seq=true not learning


Problem Description

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
import numpy as np
import matplotlib.pyplot as plt


max = 30                    # note: shadows the builtin max(), but it is unused here
step = 0.5
n_steps = int(max / step)   # 60 samples

x = np.arange(0, max, step)
x = np.cos(x) * (max - x) / max   # decaying cosine signal

y = np.roll(x, -1)                # target: the next value in the sequence
y[-1] = x[-1]

# Reshape to (samples, timesteps, features): one timestep per sample
shape = (n_steps, 1, 1)
batch_shape = (1, 1, 1)

x = x.reshape(shape)
y = y.reshape(shape)

model = Sequential()
model.add(LSTM(50, return_sequences=True, stateful=True, batch_input_shape=batch_shape))
model.add(LSTM(50, return_sequences=True, stateful=True))

model.add(Dense(1))

model.compile(loss='mse', optimizer='rmsprop')

for i in range(1000):
    model.reset_states()
    model.fit(x, y, epochs=1, batch_size=1)   # nb_epoch in the original (Keras 1 API)
    p = model.predict(x, batch_size=1)

    # Live plot: inputs (*), targets (o), predictions (.)
    plt.clf()
    plt.axis([-1, 31, -1.1, 1.1])
    plt.plot(x[:, 0, 0], '*')
    plt.plot(y[:, 0, 0], 'o')
    plt.plot(p[:, 0, 0], '.')
    plt.draw()
    plt.pause(0.001)


As stated in the Keras API docs, https://keras.io/layers/recurrent/:

the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch

So I'm using batch_size = 1, and I'm trying to predict the next value of the decaying cosine function at each timestep. For the script to be predicting correctly, the predictions (the red dots in the plot the script draws) should land on the green circles, but the training doesn't converge... Any ideas on how to make it learn?
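
As a side note (a minimal sketch, not part of the original question, reusing the model and x defined above): with stateful=True the hidden state carries over between calls until reset_states() is invoked, so the same input can produce different predictions.

p1 = model.predict(x, batch_size=1)   # starts from the current state
p2 = model.predict(x, batch_size=1)   # continues from the state left by p1
model.reset_states()
p3 = model.predict(x, batch_size=1)   # starts again from a zero state
# p2 and p3 generally differ even though the input x is identical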

Recommended Answer

The problem lies in calling model.fit separately for each epoch. When fit is called this way, the optimizer's parameters are reset each time, which harms the training process. The other issue is that reset_states should also be called before prediction - if it isn't, the states left over from fit become the starting states for prediction, which may also be harmful. The final code is the following:

for epoch in range(1000):
    model.reset_states()          # fresh state at the start of each epoch
    tot_loss = 0
    for batch in range(n_steps):
        # train_on_batch keeps the optimizer state across calls
        batch_loss = model.train_on_batch(x[batch:batch+1], y[batch:batch+1])
        tot_loss += batch_loss

    print("Loss:", tot_loss / n_steps)
    model.reset_states()          # don't let training states leak into prediction
    p = model.predict(x, batch_size=1)
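
On newer Keras versions the same effect can be had without the manual loop (a sketch under that assumption, not from the original answer): a single fit call with epochs=1000 keeps the optimizer state across epochs, and a LambdaCallback resets the LSTM states at the start of each epoch. shuffle=False preserves the sample order, which stateful training depends on.

from keras.callbacks import LambdaCallback

# Reset the LSTM states at the start of every epoch
reset_cb = LambdaCallback(on_epoch_begin=lambda epoch, logs: model.reset_states())

model.fit(x, y, epochs=1000, batch_size=1, shuffle=False, callbacks=[reset_cb])

model.reset_states()                  # clean state before prediction
p = model.predict(x, batch_size=1)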

