Why does Keras LSTM batch size used for prediction have to be the same as fitting batch size?

Problem description

When using a Keras LSTM to predict on time series data, I get errors when I train the model with a batch size of 50 and then try to predict with the same model using a batch size of 1 (i.e. just predicting the next value).

Why can't I train and fit the model with multiple batches at once, and then use that model to predict with anything other than the same batch size? It doesn't seem to make sense, but I could easily be missing something here.

Edit: this is the model. batch_size is 50 and sl is the sequence length, which is currently set to 20.

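    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    # Stateful LSTM: batch_input_shape fixes the batch size in the graph
    model = Sequential()
    model.add(LSTM(1, batch_input_shape=(batch_size, 1, sl), stateful=True))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    model.fit(trainX, trainY, epochs=epochs, batch_size=batch_size, verbose=2)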

Here is the line that predicts on the training set, used for computing the RMSE:

    # make predictions
    trainPredict = model.predict(trainX, batch_size=batch_size)

Here is the actual prediction of unseen time steps:

    for i in range(test_len):
        print('Prediction %s: ' % str(pred_count))

        next_pred_res = np.reshape(next_pred, (next_pred.shape[1], 1, next_pred.shape[0]))
        # make predictions
        forecastPredict = model.predict(next_pred_res, batch_size=1)
        forecastPredictInv = scaler.inverse_transform(forecastPredict)
        forecasts.append(forecastPredictInv)
        next_pred = next_pred[1:]
        next_pred = np.concatenate([next_pred, forecastPredict])

        pred_count += 1

The issue is with this line:

    forecastPredict = model.predict(next_pred_res, batch_size=batch_size)

The error when batch_size here is set to 1 is:

    ValueError: Cannot feed value of shape (1, 1, 2) for Tensor 'lstm_1_input:0', which has shape '(10, 1, 2)'

The same error is thrown when batch_size here is set to 50, and with the other batch sizes as well.

The full traceback is:

    forecastPredict = model.predict(next_pred_res, batch_size=1)
  File "/home/entelechy/tf_keras/lib/python3.5/site-packages/keras/models.py", line 899, in predict
    return self.model.predict(x, batch_size=batch_size, verbose=verbose)
  File "/home/entelechy/tf_keras/lib/python3.5/site-packages/keras/engine/training.py", line 1573, in predict
    batch_size=batch_size, verbose=verbose)
  File "/home/entelechy/tf_keras/lib/python3.5/site-packages/keras/engine/training.py", line 1203, in _predict_loop
    batch_outs = f(ins_batch)
  File "/home/entelechy/tf_keras/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py", line 2103, in __call__
    feed_dict=feed_dict)
  File "/home/entelechy/tf_keras/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 767, in run
    run_metadata_ptr)
  File "/home/entelechy/tf_keras/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 944, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 1, 2) for Tensor 'lstm_1_input:0', which has shape '(10, 1, 2)'

Edit: Once I set the model to stateful=False, I am able to use different batch sizes for fitting/training and prediction. What is the reason for this?

Solution

Unfortunately, what you want to do is impossible with Keras... I also struggled with this problem for a long time, and the only way I found was to dive into the rabbit hole and work with TensorFlow directly to do LSTM rolling prediction.

First, to be clear on terminology: batch_size usually means the number of sequences that are trained together, and num_steps means how many time steps are trained together. When you say batch_size=1 and "just predicting the next value", I think you mean predicting with num_steps=1.

Otherwise, it should be possible to train and predict with batch_size=50, meaning you train on 50 sequences and make 50 predictions at every time step, one for each sequence (i.e. training and prediction both use num_steps=1).

However, I think what you mean is that you want to use a stateful LSTM to train with num_steps=50 and predict with num_steps=1. Theoretically this makes sense and should be possible, and it is possible with TensorFlow, just not with Keras.

The problem: Keras requires an explicit batch size for stateful RNNs. You must specify batch_input_shape as (batch_size, num_steps, features).
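To make the three axes concrete, here is a minimal NumPy sketch that cuts a univariate series into arrays of shape (batch_size, num_steps, features). The helper `to_batches` is hypothetical and not part of Keras; it assumes a single feature and lays the data out so that row i of one batch continues row i of the previous batch, which is the ordering a stateful RNN relies on.

```python
import numpy as np

def to_batches(series, batch_size, num_steps):
    """Cut a 1-D series into shape (n_batches, batch_size, num_steps, 1).

    Hypothetical helper for illustration; not part of Keras.
    Each of the batch_size rows follows its own contiguous stretch of the
    series, so row i of batch k+1 continues row i of batch k.
    """
    series = np.asarray(series, dtype=float)
    n_batches = len(series) // (batch_size * num_steps)
    usable = series[: n_batches * batch_size * num_steps]
    # (batch_size, n_batches, num_steps): one contiguous stretch per row
    rows = usable.reshape(batch_size, n_batches, num_steps)
    # reorder to (n_batches, batch_size, num_steps) and add a feature axis
    return rows.transpose(1, 0, 2)[..., np.newaxis]

batches = to_batches(np.arange(24), batch_size=2, num_steps=3)
print(batches.shape)        # (4, 2, 3, 1)
print(batches[0, 0, :, 0])  # [0. 1. 2.]
print(batches[1, 0, :, 0])  # [3. 4. 5.]
```

Each element `batches[k]` has the (batch_size, num_steps, features) layout described above.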

The reason: Keras must allocate a fixed-size hidden state vector in the computation graph, with shape (batch_size, num_units), in order to persist the values between training batches. When stateful=False, on the other hand, the hidden state vector can be initialized dynamically with zeros at the beginning of each batch, so it does not need a fixed size. More details here: http://philipperemy.github.io/keras-stateful-lstm/
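The difference can be illustrated with a toy recurrent layer written in plain NumPy. This is only a sketch of the idea, not Keras code: the class `TinyRNN` and its parameters are made up for illustration, it uses a simple tanh cell rather than an LSTM, and the explicit size check stands in for the fixed placeholder shape that produces the ValueError in the question.

```python
import numpy as np

class TinyRNN:
    """Minimal tanh RNN layer, for illustration only (not Keras code)."""

    def __init__(self, features, num_units, batch_size=None, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(features, num_units))
        self.U = rng.normal(scale=0.1, size=(num_units, num_units))
        self.num_units = num_units
        # Stateful mode: the state buffer is allocated once, with a fixed
        # batch dimension, so it can be carried over to the next batch.
        self.stateful = batch_size is not None
        self.state = np.zeros((batch_size, num_units)) if self.stateful else None

    def forward(self, x):
        """x has shape (batch_size, num_steps, features)."""
        if self.stateful:
            if x.shape[0] != self.state.shape[0]:
                raise ValueError(
                    "batch of size %d does not match state buffer of size %d"
                    % (x.shape[0], self.state.shape[0]))
            h = self.state
        else:
            # Stateless mode: fresh zero state sized to whatever batch arrives.
            h = np.zeros((x.shape[0], self.num_units))
        for t in range(x.shape[1]):
            h = np.tanh(x[:, t, :] @ self.W + h @ self.U)
        if self.stateful:
            self.state = h  # persist for the next call
        return h

stateless = TinyRNN(features=2, num_units=4)
print(stateless.forward(np.ones((50, 1, 2))).shape)  # (50, 4)
print(stateless.forward(np.ones((1, 1, 2))).shape)   # (1, 4)

stateful = TinyRNN(features=2, num_units=4, batch_size=50)
print(stateful.forward(np.ones((50, 1, 2))).shape)   # (50, 4)
try:
    stateful.forward(np.ones((1, 1, 2)))
except ValueError as err:
    print("ValueError:", err)
```

The stateless layer accepts any batch size because it builds its state on the fly, while the stateful one rejects a batch of 1 after being created for batches of 50.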

Possible workaround: train and predict with num_steps=1. Example: https://github.com/keras-team/keras/blob/master/examples/lstm_stateful.py. This may or may not work for your problem, because the gradient for backpropagation is computed over only one time step. See: https://github.com/fchollet/keras/issues/3669

My solution: use TensorFlow. In TensorFlow you can train with batch_size=50, num_steps=100, then predict with batch_size=1, num_steps=1. This is possible by creating separate model graphs for training and prediction that share the same RNN weight matrices. See this example for next-character prediction: https://github.com/sherjilozair/char-rnn-tensorflow/blob/master/model.py#L11 and the blog post http://karpathy.github.io/2015/05/21/rnn-effectiveness/. Note that one graph can still only work with one specified batch_size, but you can set up multiple model graphs sharing weights in TensorFlow.
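The weight-sharing idea can be sketched without TensorFlow. The NumPy example below is a stand-in, not the code from the linked repository: it uses a plain tanh cell instead of an LSTM, skips training entirely, and the function `rnn_steps` is a hypothetical name. It only shows that one set of weights can be driven with a (50, 100) shaped pass and with a (1, 1) shaped pass whose state is carried by hand, and that both agree on the same sequence.

```python
import numpy as np

def rnn_steps(x, h, W, U):
    """Run a tanh RNN over x of shape (batch, steps, features), from state h."""
    for t in range(x.shape[1]):
        h = np.tanh(x[:, t, :] @ W + h @ U)
    return h

rng = np.random.default_rng(0)
features, num_units = 1, 4
# One set of weights, referenced by both "models".
W = rng.normal(scale=0.1, size=(features, num_units))
U = rng.normal(scale=0.1, size=(num_units, num_units))

# "Training-shaped" pass: batch_size=50, num_steps=100.
x_train = rng.normal(size=(50, 100, features))
h_train = rnn_steps(x_train, np.zeros((50, num_units)), W, U)

# "Prediction-shaped" pass: batch_size=1, num_steps=1, state carried by hand
# from one call to the next, as a stateful prediction graph would do.
h_pred = np.zeros((1, num_units))
for t in range(100):
    h_pred = rnn_steps(x_train[:1, t:t + 1, :], h_pred, W, U)

print(h_train.shape, h_pred.shape)       # (50, 4) (1, 4)
print(np.allclose(h_train[:1], h_pred))  # True
```

Feeding the first sequence one step at a time ends in the same hidden state as the batched pass, because only the state buffer depends on the batch size; the weights do not.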
