LSTM Sequence Prediction in Keras just outputs last step in the input


Problem Description


I am currently working with Keras using Tensorflow as the backend. I have a LSTM Sequence Prediction model shown below that I am using to predict one step ahead in a data series (input 30 steps [each with 4 features], output predicted step 31).

from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Activation

def build_model():
    model = Sequential()

    # First LSTM layer: 4 features per time step, 75 units,
    # returning the full sequence for the next LSTM layer (Keras 1.x API)
    model.add(LSTM(
        input_dim=4,
        output_dim=75,
        return_sequences=True))
    model.add(Dropout(0.2))

    # Second LSTM layer: 150 units, returning only its final output
    model.add(LSTM(
        150,
        return_sequences=False))
    model.add(Dropout(0.2))

    # Linear output: one predicted value for each of the 4 features
    model.add(Dense(
        output_dim=4))
    model.add(Activation("linear"))

    model.compile(loss="mse", optimizer="rmsprop")
    return model

The issue I'm having is that after training the model and testing it - even with the same data it trained on - what it outputs is essentially the 30th step of the input. My first thought was that the patterns in my data must be too complex to predict accurately, at least with this relatively simple model, so the best answer it can return is essentially the last element of the input. To limit the possibility of over-fitting I tried turning the training epochs down to 1, but the same behavior appeared. I've never observed this behavior before, and I have worked with this type of data before with successful results (for context, I'm using vibration data taken from 4 points on a complex physical system that has active stabilizers; the prediction is used in a PID loop for stabilization, which is why, at least for now, I'm using a simpler model to keep things fast).
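One way to quantify this suspicion, rather than judging by eye, is to compare the prediction directly against the final input step; the following is a minimal sketch, assuming a trained model `model` and the `X_test`/`Y_test` arrays described in Edit 1 below:

import numpy as np

# If the model has collapsed to copying its input forward, the prediction
# should sit much closer to input step 30 than to the true step 31
prediction = model.predict(X_test)   # shape (n_samples, 4)
last_step = X_test[:, -1, :]         # final (30th) input step, shape (n_samples, 4)

print("MSE vs. input step 30:", np.mean((prediction - last_step) ** 2))
print("MSE vs. true step 31: ", np.mean((prediction - Y_test) ** 2))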

Does that sound like the most likely cause, or does anyone have another idea? Has anyone seen this behavior before? In case it helps with visualization, here is what the prediction looks like for one vibration point compared to the desired output (note: these screenshots are zoomed in on small selections of a very large dataset - as @MarcinMożejko noticed, I did not zoom quite the same way both times, so any offset between the images is due to that; the intent is to show the horizontal offset between the prediction and the true data within each image):

...and compared to the 30th step of the input:

Note: Each data point seen by the Keras model is an average over many actual measurements, with the averaging window advanced along in time. This is done because the vibration data is extremely chaotic at the smallest resolution I can measure, so instead I use this moving-average technique to predict the larger movements (which are the more important ones to counteract anyway). That is why the offset in the first image appears to be many points rather than just one: it is 'one average', or 100 individual points, of offset.
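For reference, the smoothing described above amounts to a sliding-window mean; a sketch of the idea, assuming a window of 100 raw samples (the actual window size may differ):

import numpy as np

def moving_average(raw, window=100):
    # Average each run of `window` consecutive raw measurements so the
    # model sees the larger movements rather than the chaotic fine detail
    kernel = np.ones(window) / window
    return np.convolve(raw, kernel, mode="valid")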

-----Edit 1, code used to get from the input datasets 'X_test, Y_test' to the plots shown above-----

model_1 = lstm.build_model()  # The function above, pulled from another file 'lstm'

# Fitting on the test set is deliberate here: the behavior appears even
# when predicting on the same data the model was trained on
model_1.fit(
    X_test,
    Y_test,
    nb_epoch=1)

prediction = model_1.predict(X_test)

# Undo the window-relative normalization for sensor b before plotting
temp_predicted_sensor_b = (prediction[:, 0] + 1) * X_b_orig[:, 0]

sensor_b_y = (Y_test[:, 0] + 1) * X_b_orig[:, 0]

plot_results(temp_predicted_sensor_b, sensor_b_y)       # prediction vs. true step 31
plot_results(temp_predicted_sensor_b, X_b_orig[:, 29])  # prediction vs. input step 30

For context:

X_test.shape = (41541, 30, 4)

Y_test.shape = (41541, 4)

X_b_orig is the raw (averaged as described above) data from the b sensor. When plotting, it is multiplied with the prediction and the input data to undo the normalization I apply to improve the prediction. It has shape (41541, 30).
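The normalization step itself is not shown above, but from the de-normalization formula `(value + 1) * X_b_orig[:, 0]` it is presumably the usual window-relative scheme n_i = p_i / p_0 - 1; a sketch under that assumption:

import numpy as np

def normalize_window(window):
    # Window-relative normalization inferred from the plotting code:
    # n_i = p_i / p_0 - 1, so that (n_i + 1) * p_0 recovers p_i
    return window / window[0] - 1.0

def denormalize(values, p0):
    # Inverse of the above, as used in Edit 1 when plotting
    return (values + 1.0) * p0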

----Edit 2----

Here is a link to a complete project setup to demonstrate this behavior:

https://github.com/ebirck/lstm_sequence_prediction

Solution

That is because, for your data (stock data?), the best prediction for the 31st value is the 30th value itself. The model is correct and fits the data. I have had a similar experience predicting stock data.
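You can verify this by scoring a naive persistence baseline ("step 31 equals step 30") against the trained model; if the two errors are comparable, copying the last value really is the loss-minimizing prediction. A sketch, assuming the `model_1`, `X_test`, and `Y_test` from the question:

import numpy as np

# Persistence baseline: predict that the next step equals the last one seen
persistence = X_test[:, -1, :]
baseline_mse = np.mean((persistence - Y_test) ** 2)
model_mse = np.mean((model_1.predict(X_test) - Y_test) ** 2)

# If model_mse is not clearly below baseline_mse, the series behaves like a
# random walk at this resolution and the model's behavior is expected
print("persistence MSE:", baseline_mse, " model MSE:", model_mse)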

