Keras: How should I prepare input data for an RNN?

Question

I'm having trouble with preparing input data for RNN on Keras.

Currently, my training data dimension is: (6752, 600, 13)

  • 6752: number of training samples
  • 600: number of time steps
  • 13: size of the feature vector (the elements are floats)

X_train and Y_train both have this shape.

I want to prepare this data to be fed into SimpleRNN on Keras. Suppose that we're going through the time steps, from step #0 to step #599. Let's say I want to use input_length = 5, which means that I want to use the 5 most recent inputs (e.g. steps #10, #11, #12, #13, #14 at step #14).

How should I reshape X_train?

Should it be (6752, 5, 600, 13) or should it be (6752, 600, 5, 13)?

And what shape should Y_train have?

Should it be (6752, 600, 13) or (6752, 1, 600, 13) or (6752, 600, 1, 13)?

Recommended answer

If you only want to predict the output using the most recent 5 inputs, there is no need to ever provide the full 600 time steps of any training sample. My suggestion would be to pass the training data in the following manner:

             t=0  t=1  t=2  t=3  t=4  t=5  ...  t=598  t=599
sample0      |---------------------|
sample0           |---------------------|
sample0                |-----------------
...
sample0                                         ----|
sample0                                         ----------|
sample1      |---------------------|
sample1           |---------------------|
sample1                |-----------------
....
....
sample6751                                      ----|
sample6751                                      ----------|

The total number of training sequences will sum up to

(600 - 4) * 6752 = 4024192    # (nb_timesteps - discarded_tailing_timesteps) * nb_samples

Each training sequence consists of 5 time steps. At each time step of every sequence you pass all 13 elements of the feature vector. Consequently, the shape of the training data will be (4024192, 5, 13).
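As a quick sanity check on that arithmetic, here is a minimal sketch that derives the windowed shape from the original dimensions and estimates how much memory the expanded array needs. The helper name windowed_shape is made up for illustration and is not part of Keras or NumPy.

```python
def windowed_shape(nb_samples, nb_timesteps, nb_features, window):
    """Shape after cutting every sample into overlapping windows (hypothetical helper)."""
    # A window of length `window` can start at positions 0 .. nb_timesteps - window
    windows_per_sample = nb_timesteps - window + 1
    return (nb_samples * windows_per_sample, window, nb_features)

shape = windowed_shape(6752, 600, 13, window=5)
print(shape)  # (4024192, 5, 13)

# Rough memory footprint if every window is stored as its own copy
n_elements = shape[0] * shape[1] * shape[2]
bytes_f64 = n_elements * 8  # float64, the dtype np.random.rand returns
bytes_f32 = n_elements * 4  # float32
print(f"{bytes_f64 / 1e9:.2f} GB as float64, {bytes_f32 / 1e9:.2f} GB as float32")
```

The expanded array is about five times larger than the original data, so at float64 it takes roughly 2 GB. Casting to float32 before windowing halves that.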

This loop can reshape your data:

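import numpy as np

# Dummy data with the shape from the question: (samples, time steps, features)
data = np.random.rand(6752, 600, 13)
nb_timesteps = 5

windows = []
for sample in range(data.shape[0]):
    # All overlapping windows of length nb_timesteps for this sample
    tmp = np.array([data[sample, i:i + nb_timesteps, :]
                    for i in range(data.shape[1] - nb_timesteps + 1)])
    windows.append(tmp)

# Concatenate once at the end instead of on every iteration
new_input = np.concatenate(windows)  # shape: (4024192, 5, 13)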
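
The same windows can probably be built without a Python loop. The sketch below uses numpy.lib.stride_tricks.sliding_window_view (available from NumPy 1.20 on) and produces the windows in the same order as the loop. The answer above does not say how Y_train should be reshaped, so the last_step_targets helper is an assumption: it pairs each window with the target at the window's last time step, which matches the "steps #10 to #14 at step #14" example in the question. Both helper names are made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

def make_windows(x, window):
    """Cut (samples, timesteps, features) into (samples * n_windows, window, features)."""
    # sliding_window_view appends the window axis last:
    # (samples, n_windows, features, window)
    view = sliding_window_view(x, window_shape=window, axis=1)
    # Move the window axis in front of the feature axis, then merge samples and windows
    return view.transpose(0, 1, 3, 2).reshape(-1, window, x.shape[2])

def last_step_targets(y, window):
    """ASSUMPTION: the target of each window is y at the window's last time step."""
    return y[:, window - 1:, :].reshape(-1, y.shape[2])

# Small stand-in data so the example runs quickly
rng = np.random.default_rng(0)
X_small = rng.random((4, 20, 13))
Y_small = rng.random((4, 20, 13))

X_win = make_windows(X_small, window=5)
Y_win = last_step_targets(Y_small, window=5)
print(X_win.shape, Y_win.shape)  # (64, 5, 13) (64, 13)
```

Under that assumption, the full data set would give X of shape (4024192, 5, 13) and Y of shape (4024192, 13). If the model should instead emit an output at every step of the window, Y would need to be windowed the same way as X.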
