Custom Data Generator for Keras LSTM with TimeSeriesGenerator


Question

So I'm trying to use Keras' fit_generator with a custom data generator to feed into an LSTM network.

To illustrate the problem, I have created a toy example trying to predict the next number in a simple ascending sequence, and I use the Keras TimeseriesGenerator to create a Sequence instance:

import numpy as np
from keras.preprocessing.sequence import TimeseriesGenerator

WINDOW_LENGTH = 4
data = np.arange(0, 100).reshape(-1, 1)
data_gen = TimeseriesGenerator(data, data, length=WINDOW_LENGTH,
                               sampling_rate=1, batch_size=1)
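To see what the generator is expected to produce, the windowing can be sketched in plain NumPy (this is an illustration of the same slicing, not the real Keras class): each sample is a window of `length` consecutive rows, and the target is the row immediately after the window.

```python
import numpy as np

WINDOW_LENGTH = 4
data = np.arange(0, 100).reshape(-1, 1)

def make_windows(data, targets, length):
    # One window per starting index; the target is the value
    # that directly follows each window.
    x = np.stack([data[i:i + length] for i in range(len(data) - length)])
    y = targets[length:]
    return x, y

x, y = make_windows(data, data, WINDOW_LENGTH)
print(x.shape, y.shape)    # (96, 4, 1) (96, 1)
print(x[0].ravel(), y[0])  # [0 1 2 3] [4]
```

So with `batch_size=1`, the first batch should be the window `[0, 1, 2, 3]` with target `4`.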

I use a simple LSTM network:

from keras.models import Model
from keras.layers import Input, Dense, LSTM

data_dim = 1
input1 = Input(shape=(WINDOW_LENGTH, data_dim))
lstm1 = LSTM(100)(input1)
hidden = Dense(20, activation='relu')(lstm1)
output = Dense(data_dim, activation='linear')(hidden)

model = Model(inputs=input1, outputs=output)
model.compile(loss='mse', optimizer='rmsprop', metrics=['accuracy'])

and train it using the fit_generator function:

model.fit_generator(generator=data_gen,
                    steps_per_epoch=32,
                    epochs=10)

And this trains perfectly, and the model makes predictions as expected.

Now the problem is, in my non-toy situation I want to process the data coming out from the TimeseriesGenerator before feeding the data into the fit_generator. As a step towards this, I create a generator function which just wraps the TimeseriesGenerator used previously.

def get_generator(data, targets, window_length = 5, batch_size = 32):
    while True:
        data_gen = TimeseriesGenerator(data, targets, length=window_length, 
                                       sampling_rate=1, batch_size=batch_size)
        for i in range(len(data_gen)):
            x, y = data_gen[i]
            yield x, y

data_gen_custom = get_generator(data, data,
                                window_length=WINDOW_LENGTH, batch_size=1)

But now the strange thing is that when I train the model as before, but using this generator as the input,

model.fit_generator(generator=data_gen_custom,
                    steps_per_epoch=32,
                    epochs=10)

There is no error but the training error is all over the place (jumping up and down instead of consistently going down like it did with the other approach), and the model doesn't learn to make good predictions.

Any ideas what I'm doing wrong with my custom generator approach?

Answer

It could be because the object type changes from a Sequence (which is what a TimeseriesGenerator is) to a generic generator, and the fit_generator function treats the two differently. A cleaner solution is to subclass TimeseriesGenerator and override the processing bit:

class CustomGen(TimeseriesGenerator):
    def __getitem__(self, idx):
        # Fetch the original batch from the parent class, then
        # apply the custom processing before returning it.
        x, y = super().__getitem__(idx)
        # do processing here
        return x, y

Use this class exactly as before, since the rest of the internal logic remains the same.
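One detail worth noting in the snippet above: a `super()` proxy does not support the `[]` operator, so the parent batch has to be fetched with `super().__getitem__(idx)`. The override-and-delegate pattern can be sketched with a minimal stand-in class (this is not the real Keras class, just an illustration):

```python
# Hypothetical stand-in for TimeseriesGenerator, only to show
# the delegation pattern used by CustomGen.
class BaseGen:
    def __init__(self, data):
        self.data = data

    def __getitem__(self, idx):
        return self.data[idx]

class ScaledGen(BaseGen):
    def __getitem__(self, idx):
        # super()[idx] would raise TypeError; the parent batch
        # must be fetched via super().__getitem__(idx).
        x = super().__getitem__(idx)
        return x * 10  # the custom processing step

gen = ScaledGen([1, 2, 3])
print(gen[1])  # 20
```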

