Keras LSTM predicted timeseries squashed and shifted

Problem Description

I'm trying to get some hands-on experience with Keras during the holidays, and I thought I'd start out with the textbook example of timeseries prediction on stock data. So what I'm trying to do is, given the last 48 hours' worth of average price changes (percent since previous), predict what the average price change of the coming hour is.

However, when verifying against the test set (or even the training set), the amplitude of the predicted series is way off, and it is sometimes shifted to be either always positive or always negative, i.e., shifted away from the 0% change that I would expect to be correct for this kind of thing.

I came up with the following minimal example to show the issue:

import numpy as np
import pandas as pd
from matplotlib.pyplot import plot, axis
from keras.models import Sequential
from keras.layers import LSTM, Dense

# DataFrame.from_csv is deprecated; read_csv with index_col=0 is equivalent
df = pd.read_csv('test-data-01.csv', header=0, index_col=0)
df['pct'] = df.value.pct_change(periods=1)

seq_len = 48
vals = df.pct.values[1:]  # First pct change is NaN, skip it
sequences = []
for i in range(0, len(vals) - seq_len):
    sx = vals[i:i+seq_len].reshape(seq_len, 1)
    sy = vals[i+seq_len]
    sequences.append((sx, sy))

# Hold out the last 24 hours for testing
row = -24
trainSeqs = sequences[:row]
testSeqs = sequences[row:]

trainX = np.array([i[0] for i in trainSeqs])
trainy = np.array([i[1] for i in trainSeqs])

model = Sequential()
model.add(LSTM(25, batch_input_shape=(1, seq_len, 1)))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
model.fit(trainX, trainy, epochs=1, batch_size=1, verbose=1, shuffle=True)

pred = []
for s in trainSeqs:
    pred.append(model.predict(s[0].reshape(1, seq_len, 1)))
pred = np.array(pred).flatten()

plot(pred)
plot([i[1] for i in trainSeqs])
axis([2500, 2550, -0.03, 0.03])

As you can see, I create the training and test sequences by selecting the last 48 hours plus the next step into a tuple, then advancing 1 hour and repeating the procedure. The model is very simple: 1 LSTM layer and 1 dense layer.

I would have expected the plot of the individual predicted points to overlap pretty nicely with the plot of the training sequences (after all, this is the same set they were trained on), and to sort of match for the test sequences. However, I get the following result on the training data:

  • Orange: real data
  • Blue: predicted data

Any idea what might be going on? Did I misunderstand something?

Update: to better show what I mean by shifted and squashed, I also plotted the predicted values after shifting them back to match the real data and multiplying them to match the amplitude.

plot(pred * 12 - 0.03)
plot([i[1] for i in trainSeqs])
axis([2500, 2550, -0.03, 0.03])

As you can see, the prediction fits the real data nicely; it's just squashed and offset somehow, and I can't figure out why.

Recommended Answer

I presume you are overfitting, since the dimensionality of your data is 1, and an LSTM with 25 units seems rather complex for such a low-dimensional dataset. Here's a list of things that I would try:

  • Decreasing the LSTM dimension.
  • Adding some form of regularization to combat overfitting. For example, dropout might be a good choice.
  • Training for more epochs or changing the learning rate. The model might need more epochs or bigger updates to find the appropriate parameters.
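
Put together, those suggestions could look something like the sketch below. The unit count, dropout rate, learning rate, and epoch count are illustrative guesses, not values from the question; it also uses `input_shape` instead of the question's `batch_input_shape=(1, seq_len, 1)` so the batch size isn't fixed at 1.

```python
# A smaller, regularized variant of the question's model -- a sketch only;
# the hyperparameters here are guesses for illustration.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from tensorflow.keras.optimizers import Adam

seq_len = 48

model = Sequential()
model.add(LSTM(8, input_shape=(seq_len, 1)))  # fewer units than the original 25
model.add(Dropout(0.2))                       # regularization against overfitting
model.add(Dense(1))
model.compile(loss='mse', optimizer=Adam(learning_rate=1e-3))

# model.fit(trainX, trainy, epochs=20, batch_size=32)  # more epochs than 1
print(model.count_params())  # → 329
```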

UPDATE. Let me summarize what we discussed in the comments section.

Just for clarification, the first plot doesn't show the predicted series for a validation set, but for the training set. Therefore, my first overfitting interpretation might be inaccurate. I think a more appropriate question to ask would be: is it actually possible to predict the future price change from such a low-dimensional dataset? Machine learning algorithms aren't magical: they'll find patterns in the data only if those patterns exist.

If the past price change alone is indeed not very informative of the future price change then:

  • Your model will learn to predict the mean of the price changes (probably something around 0), since that is the value that produces the lowest loss in the absence of informative features.
  • The predictions might appear somewhat "shifted" because the price change at timestep t+1 is slightly correlated with the price change at timestep t (but still, predicting something close to 0 is the safest choice). That is indeed the only pattern that I, as a non-expert, am able to observe (i.e., that the value at timestep t+1 is sometimes similar to the one at timestep t).
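
The model-learns-the-mean point can be checked numerically: under a mean-squared-error loss, the best constant prediction is the mean of the targets. A small sketch with synthetic, roughly zero-centered "percent changes" (the data here is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=0.001, scale=0.01, size=10_000)  # synthetic pct changes

# MSE of a constant prediction c, evaluated over a grid of candidates
candidates = np.linspace(-0.01, 0.01, 201)
mse = [np.mean((y - c) ** 2) for c in candidates]
best = candidates[int(np.argmin(mse))]

print(best, y.mean())  # the best constant is (approximately) the sample mean
```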

If values at timesteps t and t+1 happened to be more correlated in general, then I presume that the model would be more confident about this correlation and the amplitude of the prediction would be bigger.
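
The "shifted and squashed" look itself can be reproduced without any neural network: for a weakly autocorrelated series, the least-squares prediction of x[t+1] from x[t] is roughly rho * x[t], i.e., a copy of the series that lags one step behind and is squashed toward 0. A sketch with a synthetic AR(1) series (the rho and noise scale here are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
rho = 0.3                       # weak lag-1 autocorrelation
n = 20_000
x = np.zeros(n)
for t in range(1, n):
    x[t] = rho * x[t - 1] + rng.normal(scale=0.01)

# Least-squares fit of x[t+1] against x[t]; the slope recovers ~rho
slope = np.polyfit(x[:-1], x[1:], 1)[0]
pred = slope * x[:-1]           # looks like x shifted one step and squashed

print(slope)                    # close to rho
print(pred.std() / x.std())    # amplitude squashed by roughly a factor of rho
```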
