Correct way to split data into batches for Keras stateful RNNs

Question

The documentation states:

the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch

Does this mean that I need to split the data into batches in the following way? For example, assume I am training a stateful RNN to predict the next integer in range(0, 5) given the previous one:

# batch_size = 3
# 0, 1, 2 etc in x are samples (timesteps and features omitted for brevity of the example)
x = [0, 1, 2, 3, 4]
y = [1, 2, 3, 4, 5]

batches_x = [[0, 1, 2], [1, 2, 3], [2, 3, 4]]
batches_y = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]

Will the state after learning on batches_x[0][0] then be the initial state for batches_x[1][0], and the state after batches_x[0][1] the initial state for batches_x[1][1] (0 for 1, 1 for 2, etc.)?

Is this the correct way to do it?

Recommended answer

Based on this answer, for which I performed some tests.

stateful=False:

Normally (stateful=False), you have one batch with many sequences:

batch_x = [
            [[0],[1],[2],[3],[4],[5]],
            [[1],[2],[3],[4],[5],[6]],
            [[2],[3],[4],[5],[6],[7]],
            [[3],[4],[5],[6],[7],[8]]
          ]

The shape is (4, 6, 1). This means that you have:

  • 1 batch
  • 4 individual sequences = this is the batch size, and it can vary
  • 6 steps per sequence
  • 1 feature per step

Every time you train, whether you repeat this batch or pass a new one, the layer sees individual sequences. Every sequence is a unique entry.
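The stateful=False layout above can be reproduced with NumPy to confirm the shape. This is only a sketch: the names series, n_sequences and n_steps are made up for illustration, and the overlapping windows simply mirror the toy data in the example.

```python
import numpy as np

# Hypothetical toy data: the integers 0..8, as in the example above.
series = np.arange(9)
n_sequences, n_steps = 4, 6

# Row i is a window starting at value i, matching batch_x in the text.
batch_x = np.stack([series[i:i + n_steps] for i in range(n_sequences)])
batch_x = batch_x[..., np.newaxis]  # add the trailing feature axis

print(batch_x.shape)  # (4, 6, 1)
```

Axis 0 is the batch (sequence) axis, axis 1 is time, and axis 2 holds the features.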

stateful=True:

When you move to a stateful layer, you no longer pass individual sequences. You pass very long sequences divided into small batches, so you need more batches:

batch_x1 = [
             [[0],[1],[2]],
             [[1],[2],[3]],
             [[2],[3],[4]],
             [[3],[4],[5]]
           ]
batch_x2 = [
             [[3],[4],[5]], #continuation of batch_x1[0]
             [[4],[5],[6]], #continuation of batch_x1[1]
             [[5],[6],[7]], #continuation of batch_x1[2]
             [[6],[7],[8]]  #continuation of batch_x1[3]
           ]

Both shapes are (4, 3, 1). This means that you have:

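  • 2 batches
  • 4 individual sequences = this is the batch size, and it must be constant
  • 6 steps per sequence (3 steps in each batch)
  • 1 feature per step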
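One way to obtain these two batches is to slice the long (4, 6, 1) array along the time axis only, so that row i of each chunk continues row i of the previous chunk. The sketch below assumes the same toy data as above; steps_per_batch is an illustrative name, not a Keras parameter.

```python
import numpy as np

# Same (4, 6, 1) array as in the stateful=False example.
full = np.stack([np.arange(i, i + 6) for i in range(4)])[..., np.newaxis]

steps_per_batch = 3  # assumed chunk length, as in the text

# Slice along axis 1 (time). Axis 0 (batch) is left untouched, so the
# sample at index i stays at index i in every chunk.
batches = [full[:, start:start + steps_per_batch]
           for start in range(0, full.shape[1], steps_per_batch)]
batch_x1, batch_x2 = batches

print(batch_x1.shape, batch_x2.shape)  # (4, 3, 1) (4, 3, 1)
print(batch_x2[0, :, 0])               # [3 4 5]
```

Note that the split runs across time steps, not across samples: the batch size stays at 4 in both chunks.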

Stateful layers are meant for huge sequences, long enough to exceed your memory or your available time for some task. You then slice your sequences and process them in parts. There is no difference in the results; the layer is not smarter and has no additional capabilities. It simply does not consider the sequences to have ended after it processes one batch. It expects the continuation of those sequences.

In this case, you decide yourself when the sequences have ended and call model.reset_states() manually.
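To illustrate why carrying the state changes nothing in the results, here is a toy recurrent cell written in plain NumPy. It is not Keras and not how Keras implements its layers: TinyStatefulRNN, its weights and its reset_states() method are hypothetical stand-ins that only mimic the behaviour described above (state kept between calls, cleared on request).

```python
import numpy as np

class TinyStatefulRNN:
    """Toy recurrent cell (not Keras): h_t = tanh(x_t @ W + h_{t-1} @ U)."""

    def __init__(self, batch_size, n_features, n_units, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(n_features, n_units))
        self.U = rng.normal(size=(n_units, n_units))
        self.state = np.zeros((batch_size, n_units))

    def reset_states(self):
        # Forget everything: the next batch starts new sequences.
        self.state = np.zeros_like(self.state)

    def __call__(self, batch):
        # batch has shape (batch_size, steps, features).
        for t in range(batch.shape[1]):
            self.state = np.tanh(batch[:, t] @ self.W + self.state @ self.U)
        return self.state


# Toy data from the text, scaled down so tanh does not saturate.
full = np.stack([np.arange(i, i + 6) for i in range(4)])[..., np.newaxis] / 10.0
batch_x1, batch_x2 = full[:, :3], full[:, 3:]

rnn = TinyStatefulRNN(batch_size=4, n_features=1, n_units=2)

whole = rnn(full).copy()        # all 6 steps in one call
rnn.reset_states()              # the sequences ended, start over
rnn(batch_x1)                   # first 3 steps, state is kept
chunked = rnn(batch_x2).copy()  # last 3 steps continue from that state

print(np.allclose(whole, chunked))  # True
```

Processing the six steps at once or as two chunks of three gives the same final state, provided the state is reset only where the sequences really end.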
