Prediction using Recurrent Neural Network on Time series dataset

Problem description

Description

I am given a dataset of 10 sequences, where each sequence corresponds to one day of stock value recordings. Each sequence consists of 50 samples recorded at 5-minute intervals, starting in the morning at 9:05 am. There is one extra recording (the 51st sample), available only in the training set, that is taken 2 hours after the last of the 50 samples rather than 5 minutes after it. This 51st sample must be predicted for the testing set, where the first 50 samples are also given.

I am using the pybrain recurrent neural network for this problem, which groups sequences together. The label (commonly known as the target y) of each sample x_i is the sample at the next time step, x_(i+1), which is a typical formulation in time series prediction.
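To make the formulation concrete, here is a minimal sketch of how the (input, target) pairs could be built from one day's sequence. It uses plain Python rather than the pybrain dataset classes, and the name `next_step_pairs` is only an illustrative helper, not part of any library.

```python
def next_step_pairs(sequence):
    """Pair every sample x_i with its target x_(i+1).

    A sequence of n samples yields n - 1 pairs, because the last
    sample has no successor inside the sequence.
    """
    return list(zip(sequence[:-1], sequence[1:]))


# Shortened stand-in for one day of values (a real day has 50 samples).
day = [23, 31, 24, 15]
pairs = next_step_pairs(day)
print(pairs)  # [(23, 31), (31, 24), (24, 15)]
```

With 50 samples per day this gives 49 training pairs per sequence, all spaced 5 minutes apart. The 51st sample is not part of these pairs because it is not one regular step after sample 50.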

Example

A sequence for one day is something like:

    Signal id    Time      value
        1     -  9:05   -   23
        2     -  9:10   -   31
        3     -  9:15   -   24
       ...    -  ...    -   ...
       50     -  13:15  -   15

Below is the label ('target') recorded 2 hours later. It is given for the training set
and must be predicted for the testing set:
       51     -  15:15   -   11

Question

Now that my recurrent neural network (RNN) has been trained on these 10 sequences, how would I use it on a new sequence to predict the stock value 2 hours after the last sample in that sequence?

Please note that I also have the stock value recorded 2 hours after the last sample for each of the training sequences, but I am not sure how to incorporate it when training the RNN, since the network expects identical time intervals between samples. Thanks!

Recommended answer

I hope I can help you.

The more mature Long Short-Term Memory (LSTM) neural network is a great fit for this kind of task. An LSTM is able to detect common "shapes" and "variations" in the stock value "graph", and there is a lot of research that tries to show that such shapes actually occur in real life. See this link for an example.
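As for the 2-hour gap itself, one common option (an assumption on my part, not something stated above) is to keep the 5-minute next-step model and roll it forward on its own predictions: 2 hours at 5-minute spacing is 24 steps. The sketch below is library-agnostic. `predict_next` stands for whatever trained one-step model you have, and `mean_of_last_three` is only a placeholder so the example runs.

```python
def steps_ahead(horizon_minutes, interval_minutes=5):
    """Number of regular steps needed to cover the horizon."""
    if horizon_minutes % interval_minutes != 0:
        raise ValueError("horizon must be a multiple of the sampling interval")
    return horizon_minutes // interval_minutes


def roll_forward(predict_next, history, n_steps):
    """Feed each prediction back in as the newest sample, n_steps times."""
    window = list(history)
    for _ in range(n_steps):
        window.append(predict_next(window))
    return window[-1]


def mean_of_last_three(window):
    """Placeholder one-step model; swap in the trained network here."""
    recent = window[-3:]
    return sum(recent) / len(recent)


n = steps_ahead(120)  # 24 five-minute steps cover the 2-hour gap
estimate = roll_forward(mean_of_last_three, [23, 31, 24, 15], n)
print(n, estimate)
```

Errors compound over 24 fed-back steps, so the known 51st samples in the training set are useful for checking how far the rolled-forward estimate drifts. Training a model that maps the 50 samples directly to the 51st is another option.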

If you want the network to achieve higher accuracy, I would also recommend feeding it the stock values from the previous year (on the exact same date), so that the number of inputs doubles from 50 to 100. Though the network might be well optimised on your dataset, it will never be able to predict the unpredictable behaviour of the future ;)
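A rough sketch of what that doubled input could look like, assuming you have a matching 50-sample sequence for the same date one year earlier. The helper name `build_input` and the sample values are made up for illustration.

```python
def build_input(today, same_day_last_year, samples_per_day=50):
    """Concatenate this year's and last year's values into one input vector."""
    if len(today) != samples_per_day or len(same_day_last_year) != samples_per_day:
        raise ValueError("both sequences must have %d samples" % samples_per_day)
    return list(today) + list(same_day_last_year)


# Made-up values standing in for two real 50-sample days.
today = [20 + (i % 7) for i in range(50)]
last_year = [18 + (i % 5) for i in range(50)]

features = build_input(today, last_year)
print(len(features))  # 100
```

Concatenation like this suits a network that takes the whole day as one fixed-size input. For a recurrent network that consumes one time step at a time, the equivalent would be two values per step (this year's and last year's) over 50 steps.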
