Why does Keras LSTM batch size used for prediction have to be the same as fitting batch size?


Problem description

When using a Keras LSTM to predict on time series data, I've been getting errors when I try to train the model using a batch size of 50, and then attempt to predict on the same model using a batch size of 1 (i.e. just predicting the next value).

Why am I not able to train and fit the model with multiple batches, and then use that model to predict with anything other than the same batch size? It doesn't seem to make sense, but then I could easily be missing something about this.

Here is the model. batch_size is 50, sl is sequence length, which is set at 20 currently.

    model = Sequential()
    model.add(LSTM(1, batch_input_shape=(batch_size, 1, sl), stateful=True))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    model.fit(trainX, trainY, epochs=epochs, batch_size=batch_size, verbose=2)

Here is the line for predicting on the training set for RMSE:

    # make predictions
    trainPredict = model.predict(trainX, batch_size=batch_size)

Here is the actual prediction of unseen time steps:

    for i in range(test_len):
        print('Prediction %s: ' % str(pred_count))
        next_pred_res = np.reshape(next_pred, (next_pred.shape[1], 1, next_pred.shape[0]))
        # make predictions
        forecastPredict = model.predict(next_pred_res, batch_size=1)
        forecastPredictInv = scaler.inverse_transform(forecastPredict)
        forecasts.append(forecastPredictInv)
        next_pred = next_pred[1:]
        next_pred = np.concatenate([next_pred, forecastPredict])
        pred_count += 1

This issue is with the line:

    forecastPredict = model.predict(next_pred_res, batch_size=batch_size)

The error when batch_size here is set to 1 is:

ValueError: Cannot feed value of shape (1, 1, 2) for Tensor 'lstm_1_input:0', which has shape '(10, 1, 2)'

This same error is thrown when batch_size here is set to 50, as it is for other batch sizes.

The total error is:

    forecastPredict = model.predict(next_pred_res, batch_size=1)
    File "/home/entelechy/tf_keras/lib/python3.5/site-packages/keras/models.py", line 899, in predict
        return self.model.predict(x, batch_size=batch_size, verbose=verbose)
    File "/home/entelechy/tf_keras/lib/python3.5/site-packages/keras/engine/training.py", line 1573, in predict
        batch_size=batch_size, verbose=verbose)
    File "/home/entelechy/tf_keras/lib/python3.5/site-packages/keras/engine/training.py", line 1203, in _predict_loop
        batch_outs = f(ins_batch)
    File "/home/entelechy/tf_keras/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2103, in __call__
        feed_dict=feed_dict)
    File "/home/entelechy/tf_keras/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 767, in run
        run_metadata_ptr)
    File "/home/entelechy/tf_keras/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 944, in _run
        % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
    ValueError: Cannot feed value of shape (1, 1, 2) for Tensor 'lstm_1_input:0', which has shape '(10, 1, 2)'

Edit: Once I set the model to stateful=False, I am able to use different batch sizes for fitting/training and prediction. What is the reason for this?

Solution



Unfortunately what you want to do is impossible with Keras ... I've also struggled for a long time with this problem, and the only way is to dive into the rabbit hole and work with Tensorflow directly to do LSTM rolling prediction.

First, to be clear on terminology, batch_size usually means the number of sequences that are trained together, and num_steps means how many time steps are trained together. When you say batch_size=1 and "just predicting the next value", I think you meant to predict with num_steps=1.

Otherwise, it should be possible to train and predict with batch_size=50, meaning you are training on 50 sequences and making 50 predictions at each time step, one for each sequence (meaning training/prediction num_steps=1).

However, I think what you mean is that you want to use a stateful LSTM to train with num_steps=50 and do prediction with num_steps=1. Theoretically this makes sense and should be possible, and it is possible with Tensorflow, just not with Keras.

The problem: Keras requires an explicit batch size for a stateful RNN. You must specify batch_input_shape=(batch_size, num_steps, features).

The reason: Keras must allocate a fixed-size hidden state vector in the computation graph with shape (batch_size, num_units) in order to persist the values between training batches. On the other hand, when stateful=False, the hidden state vector can be initialized dynamically with zeroes at the beginning of each batch so it does not need to be a fixed size. More details here: http://philipperemy.github.io/keras-stateful-lstm/
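The allocation difference described above can be sketched in plain numpy. This is a conceptual model only, not the actual Keras internals; NUM_UNITS and the class names are hypothetical:

```python
import numpy as np

NUM_UNITS = 4  # hypothetical hidden-state size


def stateless_init(batch_size):
    # stateful=False: the hidden state is (re)initialized to zeros at the
    # start of every batch, so its first dimension can be anything.
    return np.zeros((batch_size, NUM_UNITS))


class StatefulBuffer:
    # stateful=True: the state is allocated once with a fixed batch_size
    # and must persist between batches, so a batch of any other size
    # cannot be fed through it.
    def __init__(self, batch_size):
        self.h = np.zeros((batch_size, NUM_UNITS))

    def check_batch(self, x):
        if x.shape[0] != self.h.shape[0]:
            raise ValueError("Cannot feed value of shape %s: state has "
                             "batch size %d" % (x.shape, self.h.shape[0]))


# Stateless: both batch sizes are fine.
assert stateless_init(50).shape == (50, NUM_UNITS)
assert stateless_init(1).shape == (1, NUM_UNITS)

# Stateful: allocated for 10 sequences (as in the error above), so a
# batch of 1 is rejected, mirroring the ValueError in the question.
buf = StatefulBuffer(10)
try:
    buf.check_batch(np.zeros((1, 1, 2)))
    raised = False
except ValueError:
    raised = True
print("rejected:", raised)
```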

Possible workaround: Train and predict with num_steps=1. Example: https://github.com/keras-team/keras/blob/master/examples/lstm_stateful.py. This may or may not work for your problem, as the gradient for backpropagation will be computed over only one time step. See: https://github.com/fchollet/keras/issues/3669

My solution: use Tensorflow: In Tensorflow you can train with batch_size=50, num_steps=100, then do predictions with batch_size=1, num_steps=1. This is possible by creating different model graphs for training and prediction that share the same RNN weight matrices. See this example for next-character prediction: https://github.com/sherjilozair/char-rnn-tensorflow/blob/master/model.py#L11 and blog post http://karpathy.github.io/2015/05/21/rnn-effectiveness/. Note that one graph can still only work with one specified batch_size, but you can set up multiple model graphs sharing weights in Tensorflow.
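The two-graphs-sharing-weights idea can be illustrated with a toy numpy RNN. This is a conceptual sketch under assumed sizes (NUM_UNITS, NUM_FEATURES are made up), not the char-rnn code or real Tensorflow: the key point is that the weight matrices depend on neither batch_size nor num_steps, so one set of weights can serve both a (50, 100) training graph and a (1, 1) prediction graph:

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_UNITS, NUM_FEATURES = 8, 2

# One shared set of RNN weights, used by both "graphs".
weights = {
    "Wx": rng.normal(size=(NUM_FEATURES, NUM_UNITS)),
    "Wh": rng.normal(size=(NUM_UNITS, NUM_UNITS)),
    "b": np.zeros(NUM_UNITS),
}


def rnn_forward(x, weights):
    # x has shape (batch_size, num_steps, features). Note the weight
    # shapes involve neither batch_size nor num_steps, which is why the
    # training and prediction graphs can share them.
    batch_size, num_steps, _ = x.shape
    h = np.zeros((batch_size, NUM_UNITS))
    for t in range(num_steps):
        h = np.tanh(x[:, t] @ weights["Wx"] + h @ weights["Wh"] + weights["b"])
    return h


# "Training graph": batch_size=50, num_steps=100.
h_train = rnn_forward(rng.normal(size=(50, 100, NUM_FEATURES)), weights)
# "Prediction graph": batch_size=1, num_steps=1, same weights.
h_pred = rnn_forward(rng.normal(size=(1, 1, NUM_FEATURES)), weights)

print(h_train.shape, h_pred.shape)  # (50, 8) (1, 8)
```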
