Keras fit_generator() - How does batch for time series work?

Problem description

Context:

I am currently working on time series prediction using Keras with Tensorflow backend and, therefore, studied the tutorial provided here.

Following this tutorial, I came to the point where the generator for the fit_generator() method is described. The output this generator generates is as follows (left sample, right target):

[[[10. 15.]
  [20. 25.]]] => [[30. 35.]]     -> Batch no. 1: 2 Samples | 1 Target
  ---------------------------------------------
[[[20. 25.]
  [30. 35.]]] => [[40. 45.]]     -> Batch no. 2: 2 Samples | 1 Target
  ---------------------------------------------
[[[30. 35.]
  [40. 45.]]] => [[50. 55.]]     -> Batch no. 3: 2 Samples | 1 Target
  ---------------------------------------------
[[[40. 45.]
  [50. 55.]]] => [[60. 65.]]     -> Batch no. 4: 2 Samples | 1 Target
  ---------------------------------------------
[[[50. 55.]
  [60. 65.]]] => [[70. 75.]]     -> Batch no. 5: 2 Samples | 1 Target
  ---------------------------------------------
[[[60. 65.]
  [70. 75.]]] => [[80. 85.]]     -> Batch no. 6: 2 Samples | 1 Target
  ---------------------------------------------
[[[70. 75.]
  [80. 85.]]] => [[90. 95.]]     -> Batch no. 7: 2 Samples | 1 Target
  ---------------------------------------------
[[[80. 85.]
  [90. 95.]]] => [[100. 105.]]   -> Batch no. 8: 2 Samples | 1 Target

In the tutorial the TimeSeriesGenerator was used, but for my question it is secondary if a custom generator or this class is used. Regarding the data, we have 8 steps_per_epoch and a sample of shape (8, 1, 2, 2). The generator is fed to a Recurrent Neural Network, implemented by an LSTM.

My question

fit_generator() only allows a single target per batch, as outputted by the TimeSeriesGenerator. When I first read about the option of batches for fit(), I thought that I could have multiple samples and a corresponding number of targets (which are processed batchwise, meaning row by row). But this is not allowed by fit_generator() and, therefore, obviously false. This would look for example like:

[[[10. 15. 20. 25.]]] => [[30. 35.]]     
[[[20. 25. 30. 35.]]] => [[40. 45.]]    
    |-> Batch no. 1: 2 Samples | 2 Targets
  ---------------------------------------------
[[[30. 35. 40. 45.]]] => [[50. 55.]]    
[[[40. 45. 50. 55.]]] => [[60. 65.]]    
    |-> Batch no. 2: 2 Samples | 2 Targets
  ---------------------------------------------
...

Secondly, I thought that, for example, [10, 15] and [20, 25] were used as input for the RNN consecutively for the target [30, 35], meaning that this is analogous to inputting [10, 15, 20, 25]. Since the output from the RNN differs using the second approach (I tested it), this also has to be a wrong conclusion.

Therefore, my questions are:

  1. Why is only one target per batch allowed (I know there are some workarounds, but there has to be a reason)?
  2. How may I understand the calculation of one batch? Meaning, how is some input like [[[40, 45], [50, 55]]] => [[60, 65]] processed and why is it not analogous to [[[40, 45, 50, 55]]] => [[60, 65]]?



Edit according to today's answer
Since there is some misunderstanding about my definition of samples and targets, here is what I understand Keras is trying to tell me when it says:

ValueError: Input arrays should have the same number of samples as target arrays. Found 1 input samples and 2 target samples.

This error occurs when, for example, I create a batch that looks like this:

#This is just a single batch - multiple batches would be fed to fit_generator()
(array([[[0, 1, 2, 3, 4],
         [5, 6, 7, 8, 9]]]),
 array([[ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]]))

This is supposed to be a single batch containing two time sequences of length 5 (5 consecutive data points / timesteps), whose targets are also two corresponding sequences: [5, 6, 7, 8, 9] is the target of [0, 1, 2, 3, 4], and [10, 11, 12, 13, 14] is the corresponding target of [5, 6, 7, 8, 9].
The sample shape in this would be (number_of_batches, number_of_elements_per_batch, sequence_size) and the target shape (number_of_elements_per_batch, sequence_size).
Keras sees 2 target samples (in the ValueError) because I provide 3D samples as input but 2D targets as output (maybe I just don't get how to provide 3D targets...).

Anyhow, according to @today's answer/comments, Keras interprets this as two timesteps with five features each. Regarding my first question (where, as in this edit example, I still want a sequence as the target for my sequence), I am looking for information on how/whether I can achieve this and what such a batch would look like (as I tried to visualize in the question).

Answer

Short answer:

Why is only one target per batch allowed (I know there are some workarounds, but there has to be a reason)?

That's not the case at all. There is no restriction on the number of target samples in a batch. The only requirement is that you should have the same number of input and target samples in each batch. Read the long answer for further clarification.
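For instance, a batch with three input samples and three matching targets is perfectly valid; the following hand-written sketch (my own, using the question's toy series) shows the kind of batch TimeseriesGenerator would produce with batch_size=3:

```python
import numpy as np

# One batch: 3 input samples, each with 2 timesteps of 2 features ...
x = np.array([[[10, 15], [20, 25]],
              [[20, 25], [30, 35]],
              [[30, 35], [40, 45]]])  # shape (3, 2, 2)
# ... and 3 corresponding targets, one per input sample
y = np.array([[30, 35],
              [40, 45],
              [50, 55]])              # shape (3, 2)

assert len(x) == len(y)  # the only requirement: equal numbers of samples
```

The "one target per batch" impression in the question simply comes from the tutorial using a batch size of 1.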

How may I understand the calculation of one batch? Meaning, how is some input like [[[40, 45], [50, 55]]] => [[60, 65]] processed and why is it not analogous to [[[40, 45, 50, 55]]] => [[60, 65]]?

The first one is a multi-variate timeseries (i.e. each timestep has more than one feature), and the second one is a uni-variate timeseries (i.e. each timestep has one feature). So they are not equivalent. Read the long answer for further clarification.
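The difference shows up directly in the array shapes (a small NumPy illustration of the point; the RNN unrolls once per entry along the timestep axis, so the two inputs drive a different number of recurrent steps):

```python
import numpy as np

# Multi-variate: 2 timesteps with 2 features each -> the RNN unrolls twice
multi = np.array([[[40, 45], [50, 55]]])
# Flattened form: a single timestep carrying all 4 values as features
# -> the RNN unrolls only once (a uni-variate series of 4 timesteps
#    would instead have shape (1, 4, 1))
flat = np.array([[[40, 45, 50, 55]]])

print(multi.shape)  # (1, 2, 2): (batch, timesteps, features)
print(flat.shape)   # (1, 1, 4)
```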

Long answer:

I'll give the answer I mentioned in the comments section and try to elaborate on it using examples:

I think you are mixing up samples, timesteps, features and targets. Let me describe how I understand it: in the first example you provided, it seems that each input sample consists of 2 timesteps, e.g. [10, 15] and [20, 25], where each timestep consists of two features, e.g. 10 and 15, or 20 and 25. Further, the corresponding target consists of one timestep, e.g. [30, 35], which also has two features. In other words, each input sample in a batch must have a corresponding target. However, the shape of each input sample and that of its corresponding target may not necessarily be the same.

For example, consider a model where both its input and output are timeseries. If we denote the shape of each input sample as (input_num_timesteps, input_num_features) and the shape of each target (i.e. output) array as (output_num_timesteps, output_num_features), we would have the following cases:

1) The number of input and output timesteps are the same (i.e. input_num_timesteps == output_num_timesteps). Just as an example, the following model could achieve this:

from keras import layers
from keras import models

inp = layers.Input(shape=(input_num_timesteps, input_num_features))

# a stack of RNN layers on top of each other (this is optional)
x = layers.LSTM(..., return_sequences=True)(inp)
# ...
x = layers.LSTM(..., return_sequences=True)(x)

# a final RNN layer that has `output_num_features` units
out = layers.LSTM(output_num_features, return_sequences=True)(x)

model = models.Model(inp, out)

2) The number of input and output timesteps are different (i.e. input_num_timesteps != output_num_timesteps). This is usually achieved by first encoding the input timeseries into a vector using a stack of one or more LSTM layers, and then repeating that vector output_num_timesteps times to get a timeseries of the desired length. For the repeat operation, we can easily use the RepeatVector layer in Keras. Again, just as an example, the following model could achieve this:

from keras import layers
from keras import models

inp = layers.Input(shape=(input_num_timesteps, input_num_features))

# a stack of RNN layers on top of each other (this is optional)
x = layers.LSTM(..., return_sequences=True)(inp)
# ...
x = layers.LSTM(...)(x)  # The last layer ONLY returns the last output of RNN (i.e. return_sequences=False)

# repeat `x` as needed (i.e. as the number of timesteps in output timeseries)
x = layers.RepeatVector(output_num_timesteps)(x)

# a stack of RNN layers on top of each other (this is optional)
x = layers.LSTM(..., return_sequences=True)(x)
# ...
out = layers.LSTM(output_num_features, return_sequences=True)(x)

model = models.Model(inp, out)

As a special case, if the number of output timesteps is 1 (e.g. the network is trying to predict the next timestep given the last t timesteps), we may not need to use repeat and instead we can just use a Dense layer (in this case the output shape of the model would be (None, output_num_features), and not (None, 1, output_num_features)):

inp = layers.Input(shape=(input_num_timesteps, input_num_features))

# a stack of RNN layers on top of each other (this is optional)
x = layers.LSTM(..., return_sequences=True)(inp)
# ...
x = layers.LSTM(...)(x)  # The last layer ONLY returns the last output of RNN (i.e. return_sequences=False)

out = layers.Dense(output_num_features, activation=...)(x)

model = models.Model(inp, out)


Note that the architectures provided above are just for illustration; you may need to tune or adapt them, e.g. by adding more layers such as a Dense layer, based on your use case and the problem you are trying to solve.

Update: The problem is that you are not paying enough attention when reading my comments and answer, as well as the error raised by Keras. The error clearly states that:

... Found 1 input samples and 2 target samples.

So, after reading this carefully, if I were you I would say to myself: "OK, Keras thinks that the input batch has 1 input sample, but I think I am providing two samples!! Since I am a very good person(!), I think it's much more likely that I am wrong than Keras, so let's find out what I am doing wrong!". A simple and quick check is to just examine the shape of the input array:

>>> np.array([[[0, 1, 2, 3, 4],
               [5, 6, 7, 8, 9]]]).shape
(1, 2, 5)

"Oh, it says (1,2,5)! So that means one sample which has two timesteps and each timestep has five features!!! So I was wrong into thinking that this array consists of two samples of length 5 where each timestep is of length 1!! So what should I do now???" Well, you can fix it, step-by-step:

# step 1: I want a numpy array
s1 = np.array([])

# step 2: I want it to have two samples
s2 = np.array([
               [],
               []
              ])

# step 3: I want each sample to have 5 timesteps of length 1 in them
s3 = np.array([
               [
                [0], [1], [2], [3], [4]
               ],
               [
                [5], [6], [7], [8], [9]
               ]
              ])

>>> s3.shape
(2, 5, 1)

Voila! We did it! This was the input array; now check the target array, it must have two target samples of length 5 each with one feature, i.e. having a shape of (2, 5, 1):

>>> np.array([[ 5,  6,  7,  8,  9],
              [10, 11, 12, 13, 14]]).shape
(2, 5)

Almost! The last dimension (i.e. 1) is missing (NOTE: depending on the architecture of your model you may or may not need that last axis). So we can use the step-by-step approach above to find our mistake, or alternatively we can be a bit clever and just add an axis to the end:

>>> t = np.array([[ 5,  6,  7,  8,  9],
                  [10, 11, 12, 13, 14]])
>>> t = np.expand_dims(t, axis=-1)
>>> t.shape
(2, 5, 1)
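Putting both fixes together, one batch yielded by the generator from the edit would then look like this (a sketch; a sequence-to-sequence model as in case 1 above, with return_sequences=True on the last LSTM, would accept these shapes):

```python
import numpy as np

# Two univariate input sequences of 5 timesteps each ...
x = np.expand_dims(np.array([[0, 1, 2, 3, 4],
                             [5, 6, 7, 8, 9]]), axis=-1)       # (2, 5, 1)
# ... and two corresponding target sequences of the same shape
y = np.expand_dims(np.array([[ 5,  6,  7,  8,  9],
                             [10, 11, 12, 13, 14]]), axis=-1)  # (2, 5, 1)

batch = (x, y)  # one such tuple per yield of the generator fed to fit_generator()
```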

Sorry, I can't explain it better than this! But in any case, when you see that something (i.e. shape of input/target arrays) is repeated over and over in my comments and my answer, assume that it must be something important and should be checked.
