TensorFlow Keras time series prediction with X and y having different shapes


Question

I am trying to do time series prediction with TensorFlow and Keras, where X and y have different dimensions:

X.shape = (5000, 12)
y.shape = (5000, 3, 12)

When I run the following:

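from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator

n_input = 7  # length of each input window, in days
generator = TimeseriesGenerator(X, y, length=n_input, batch_size=1)

# Inspect the shapes of the first few (input, target) batches
for i in range(5):
    x_, y_ = generator[i]
    print(x_.shape)
    print(y_.shape)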

I get the expected output:

(1, 7, 12)
(1, 3, 12)
(1, 7, 12)
(1, 3, 12)
...

This is because my data is meteorological and covers 5000 days. For the training array X I use a sliding window of 7 days, with each day containing 12 features (air pressure, temperature, humidity, among others). The target array y holds sliding windows of 3 days, so that each 7-day window is used to predict the 3 days that follow it.
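To make the pairing concrete, here is a minimal NumPy sketch of how arrays with these shapes could be built and how each 7-day window lines up with its 3-day target. It assumes the generator pairs the window X[i:i+7] with the target y[i+7]. The names `series` and `window_pair` are hypothetical and used only for illustration, and random numbers stand in for the real measurements.

```python
import numpy as np

n_days, n_features = 5000, 12
n_input, n_output = 7, 3

rng = np.random.default_rng(0)
series = rng.uniform(0, 1, (n_days, n_features))  # stand-in for the weather data

# y[j] holds days j .. j+2, so only the first n_days - 2 positions have a full window.
n_rows = n_days - n_output + 1
y = np.stack([series[j:j + n_output] for j in range(n_rows)])
X = series[:n_rows]  # trimmed so that X and y have the same length


def window_pair(X, y, i, length):
    """Hypothetical helper: the i-th (input, target) pair for batch_size=1."""
    return X[i:i + length][None], y[i + length][None]


x_, y_ = window_pair(X, y, 0, n_input)
print(x_.shape, y_.shape)  # (1, 7, 12) (1, 3, 12)
```

In this sketch the first target is exactly days 7 to 9 of the series, that is, the 3 days following the first 7-day window.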

But when I try to fit the model, I get an error because the shapes of the model output and the y array do not match:

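from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# This is the failing version: the model outputs shape (None, 1)
model = Sequential()
model.add(LSTM(4, input_shape=(None, 12)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit_generator(generator, epochs=3).history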

ValueError: A target array with shape (1, 3, 12) was passed for an output of shape (None, 1) while using as loss `mean_squared_error`. This loss expects targets to have the same shape as the output.

So is there a way to adjust the architecture to handle the mismatch in dimensions? Or is there a way to reshape X and y so that they work with this architecture? I also tried reshaping X into (5000, 7, 12), but this gave a dimensionality error as well. Thanks.

Recommended answer

Your generator is correct. It is your network that doesn't work.

You don't handle the dimensionality correctly. You are dealing with sequences, so you need to set return_sequences=True in your LSTM layer. Your input has 7 timesteps while your output has 3, so the network has to go from 7 timesteps to 3 (you can do this with pooling, among other options). The final Dense layer also needs 12 units rather than 1, to match the 12 features of each target day.

Below is a dummy example. I don't use a pooling operation here, but simply select the last part of the sequence in order to get an output of 3 timesteps:

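import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Lambda, Dense

# Random stand-in data with the same shapes as in the question
X = np.random.uniform(0,1, (5000, 12))
y = np.random.uniform(0,1, (5000, 3, 12))

n_input = 7
generator = tf.keras.preprocessing.sequence.TimeseriesGenerator(X, y, length=n_input, batch_size=32)

model = Sequential()
# return_sequences=True keeps one output per timestep: (batch, 7, 4)
model.add(LSTM(4, return_sequences=True, input_shape=(n_input, 12)))
# Keep only the last 3 timesteps: (batch, 3, 4)
model.add(Lambda(lambda x: x[:,-3:,:]))
# Dense acts on the last axis: (batch, 3, 12), matching the target
model.add(Dense(12))
model.compile(loss='mean_squared_error', optimizer='adam')

model.summary()

model.fit(generator, epochs=2)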
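The shape bookkeeping in this model can be checked without training anything. The following is a minimal NumPy sketch, not the Keras implementation: a random array stands in for the LSTM output, and `W` and `b` are placeholder weights for the Dense layer, all of them assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, timesteps, units, n_features = 32, 7, 4, 12

# Stand-in for the LSTM output with return_sequences=True: one 4-vector per timestep.
lstm_out = rng.normal(size=(batch, timesteps, units))

# Same slice as the Lambda layer: keep only the last 3 timesteps.
last_three = lstm_out[:, -3:, :]

# A Dense layer acts on the last axis only; W and b are random placeholders here.
W = rng.normal(size=(units, n_features))
b = np.zeros(n_features)
pred = last_three @ W + b

print(last_three.shape, pred.shape)  # (32, 3, 4) (32, 3, 12)
```

The final shape (32, 3, 12) is the shape of one batch of targets from the generator, which is what the mean squared error loss requires.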

Here is an example with a pooling operation. With a pool size of 2 and the default stride, the 7 timesteps are reduced to 3:

from tensorflow.keras.layers import MaxPool1D

model = Sequential()
model.add(LSTM(4, return_sequences=True, input_shape=(n_input, 12)))
# Pooling over time reduces 7 timesteps to 3
model.add(MaxPool1D(2)) # AvgPool1D also works
model.add(Dense(12))
model.compile(loss='mean_squared_error', optimizer='adam')

model.summary()
model.fit(generator, epochs=2)
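Why a pool size of 2 turns 7 timesteps into exactly 3 can be sketched in NumPy. This is an illustrative reimplementation under the assumption of no padding and a stride equal to the pool size, not the Keras layer itself; `pooled_length` and `max_pool_1d` are hypothetical helper names.

```python
import numpy as np


def pooled_length(timesteps, pool_size, strides=None):
    """Output length of 1D pooling without padding."""
    strides = pool_size if strides is None else strides
    return (timesteps - pool_size) // strides + 1


def max_pool_1d(x, pool_size):
    """Max-pool a (batch, timesteps, features) array along the time axis."""
    n_out = pooled_length(x.shape[1], pool_size)
    trimmed = x[:, :n_out * pool_size, :]  # drop the unpaired last timestep
    windows = trimmed.reshape(x.shape[0], n_out, pool_size, x.shape[2])
    return windows.max(axis=2)


x = np.arange(7, dtype=float).reshape(1, 7, 1)  # timesteps 0..6, one feature
pooled = max_pool_1d(x, 2)

print(pooled_length(7, 2))  # 3
print(pooled[0, :, 0])      # [1. 3. 5.]
```

Note that under these assumptions the seventh timestep has no partner and does not contribute to the pooled output, which may matter if the most recent day carries the most useful information.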

Here is an example with return_sequences=False and a RepeatVector layer, which copies the final LSTM output 3 times to form the output sequence:

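from tensorflow.keras.layers import RepeatVector

model = Sequential()
# return_sequences=False keeps only the last output: (batch, 4)
model.add(LSTM(4, return_sequences=False, input_shape=(n_input, 12)))
# Repeat it 3 times along a new time axis: (batch, 3, 4)
model.add(RepeatVector(3))
model.add(Dense(12))
model.compile(loss='mean_squared_error', optimizer='adam')

model.summary()
model.fit(generator, epochs=2)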
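The effect of repeating the vector can also be sketched in NumPy. As before, this is only an illustration: a random array stands in for the LSTM output, and `W` and `b` are placeholder Dense weights.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, units, n_features, n_output = 32, 4, 12, 3

# Stand-in for the LSTM output with return_sequences=False: one vector per sample.
last_state = rng.normal(size=(batch, units))

# What repeating does: copy that vector along a new time axis.
repeated = np.repeat(last_state[:, None, :], n_output, axis=1)

# The Dense layer applies the same placeholder weights at every timestep.
W = rng.normal(size=(units, n_features))
b = np.zeros(n_features)
pred = repeated @ W + b

print(repeated.shape, pred.shape)            # (32, 3, 4) (32, 3, 12)
print(np.allclose(pred[:, 0], pred[:, 1]))   # True
```

The shapes match the target, but the last line shows a limitation of this minimal form: the Dense layer sees the same vector at every step, so the three predicted days come out identical. One possible remedy, not part of the answer above, is to place another recurrent layer with return_sequences=True between the repeat step and the Dense layer.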
