Interesting results from LSTM RNN: lagged results for train and validation data

Question

As an introduction to RNN/LSTM (stateless), I'm training a model on sequences of 200 days of previous data (X), including things like daily price change and daily volume change. For the labels (Y) I use the % price change from the current price to the price 4 months later. Basically I want to estimate the market direction, not to be 100% accurate. But I'm getting some odd results...
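For reference, the setup described above might be built roughly like this. This is only a sketch: the function name `make_windows`, the two-feature layout, and the choice of 84 trading days as "4 months" are assumptions for illustration, not details taken from the original code.

```python
import numpy as np

def make_windows(prices, volumes, seq_len=200, horizon=84):
    """Build (X, y) pairs: 200 days of daily changes -> % change 4 months ahead.

    horizon=84 is an assumed number of trading days in roughly 4 months.
    """
    prices = np.asarray(prices, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    # Daily fractional changes; row i describes the move from day i to day i + 1.
    feats = np.stack(
        [np.diff(prices) / prices[:-1], np.diff(volumes) / volumes[:-1]],
        axis=1,
    )
    X, y = [], []
    # t is the "current" day: the window holds the seq_len daily changes ending at t.
    for t in range(seq_len, len(prices) - horizon):
        X.append(feats[t - seq_len:t])
        # Label: fractional price change from day t to day t + horizon.
        y.append(prices[t + horizon] / prices[t] - 1.0)
    return np.array(X), np.array(y)
```

Note that each label is expressed relative to the price on day `t`, so turning a prediction back into a price means multiplying by that day's actual price.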

When I then test my model with the training data, the output from the model is a perfect fit to the actual data, except that it lags by exactly 4 months:

When I shift the data by 4 months, you can see it's a perfect fit.

I can obviously understand why the training data would be a very close fit, as the model has seen it all during training - but why the 4-month lag?

It does the same thing with the validation data (note the area I highlighted with the red box for future reference):

Time-shifted:

It's not as close-fitting as the training data, as you'd expect, but it's still too close for my liking - I just don't think it can be this accurate (see the little blip in the red rectangle as an example). I think the model is acting as a naive predictor; I just can't work out how or why it's doing it.

To generate this output from the validation data, I input a sequence of 200 timesteps, but there's nothing in the data sequence that says what the % price change will be in 4 months - it's entirely disconnected, so how is it so accurate? The 4-month lag is obviously another indicator that something's not right here. I don't know how to explain that, but I suspect the two are linked.

Recommended answer

Okay, I realised my error: the way I was using the model to generate the forecast line was naive. For every date in the graph above, I was getting an output from the model and then applying the forecasted % change to the actual price for that date - that gave the predicted price in 4 months' time.

Given that the markets usually only move within a margin of 0-3% (plus or minus) over a 4-month period, my forecasts were always going to closely mirror the current price, just with a 4-month lag.

So at every date the predicted output was being re-based onto the actual price, and the model line could never deviate far from the actual; it'd be the same line, but within a margin of 0-3% (plus or minus).
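The re-basing effect can be reproduced without any model at all. The sketch below is purely illustrative: the price series is a synthetic random walk, the "predictions" are random numbers within ±3% standing in for model output, and 84 trading days is an assumed length for 4 months.

```python
import numpy as np

rng = np.random.default_rng(0)
horizon = 84  # assumed number of trading days in roughly 4 months
n = 1000

# Synthetic random-walk price series (illustrative, not real market data).
price = 100.0 * np.cumprod(1.0 + rng.normal(0.0, 0.01, size=n))

# Stand-in for model output: arbitrary predicted changes within +/-3%.
pred_change = rng.uniform(-0.03, 0.03, size=n)

# Naive plot: re-base each forecast onto that day's actual price...
forecast = price * (1.0 + pred_change)
# ...and draw it at the date the forecast is for.
forecast_dates = np.arange(n) + horizon

# Shifted back by `horizon`, the forecast line is the actual price within 3%.
max_gap = np.max(np.abs(forecast / price - 1.0))
print(f"largest relative gap after shifting back: {max_gap:.4f}")
```

Even random predictions produce a line that hugs the actual price once shifted, so a close fit in this kind of plot says nothing about how good the model is.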

Really, the graph isn't important, and it doesn't reflect the way I'll use the output anyway, so I'm going to ditch trying to get a visual representation and concentrate on trying to find different metrics that lower the validation loss.
