LSTM to forecast numerical data by having categorical data as input


Problem description

I have a DataFrame like this:

import pandas as pd

df = pd.DataFrame([
{'date':'2021-01-15', 'value':145, 'label':'negative'},
{'date':'2021-01-16', 'value':144, 'label':'positive'},
{'date':'2021-01-17', 'value':147, 'label':'positive'},
{'date':'2021-01-18', 'value':146, 'label':'negative'},
{'date':'2021-01-19', 'value':155, 'label':'negative'},
{'date':'2021-01-20', 'value':157, 'label':'positive'},
{'date':'2021-01-21', 'value':158, 'label':'positive'},
{'date':'2021-01-22', 'value':157, 'label':'negative'},
{'date':'2021-01-23', 'value':157, 'label':'positive'},
{'date':'2021-01-24', 'value':152, 'label':'positive'}, 
{'date':'2021-01-25', 'value':159, 'label':'negative'},
{'date':'2021-01-26', 'value':162, 'label':'positive'},
{'date':'2021-01-27', 'value':160, 'label':'positive'},
{'date':'2021-01-28', 'value':153, 'label':'negative'},
{'date':'2021-01-29', 'value':149, 'label':'negative'},
{'date':'2021-01-30', 'value':156, 'label':'positive'},
{'date':'2021-01-31', 'value':168, 'label':'positive'},
{'date':'2021-02-01', 'value':179, 'label':'negative'},
{'date':'2021-02-02', 'value':184, 'label':'positive'},
{'date':'2021-02-03', 'value':189, 'label':'positive'},
{'date':'2021-02-04', 'value':196, 'label':'positive'}])

I have already converted the date column strings to datetime format and set that column as the index with the set_index method.

Once n and m are fixed, I would like to use a recurrent neural network (LSTM) to predict the last n values of the value column, taking into account only the categories of the label column.

I have just encoded the label column with the following:

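from sklearn.preprocessing import OneHotEncoder
# One-hot encode the label column; fit_transform expects a 2D array.
# scikit-learn >= 1.2 names this parameter sparse_output (older releases used sparse).
hot = OneHotEncoder(sparse_output=False).fit_transform(df.label.to_numpy().reshape(-1, 1))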

and scaled the data:

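from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))
# MinMaxScaler expects a 2D array of shape (n_samples, n_features), so reshape the column
scaled = scaler.fit_transform(df.value.to_numpy().reshape(-1, 1))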

But I have not managed to build the train and test sets in a way that respects the m and n conditions.

Any suggestions?

Recommended answer

First of all, you have to transform the dataset into a time-series form that an LSTM supports. Build a model that predicts only the next day, then roll the testing process forward, repeating the single-step prediction as many times as the number of predictions you want.
You can get the complete version from here
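
The two steps described above (windowing the series, then rolling a one-step prediction over the held-out days) could be sketched roughly as follows. This is only an illustration under several assumptions, not the answerer's code: `make_windows`, `rolling_forecast`, `predict_next` and the look-back length `window` are hypothetical names, the question does not define m so it is not used here, and a constant stand-in replaces the trained LSTM so the sketch runs with NumPy alone. A Keras LSTM layer expects input shaped (samples, timesteps, features), which is the shape `make_windows` returns.

```python
import numpy as np

# Data copied from the question: 21 daily values and their labels (1 = positive).
values = np.array([145, 144, 147, 146, 155, 157, 158, 157, 157, 152, 159,
                   162, 160, 153, 149, 156, 168, 179, 184, 189, 196], dtype=float)
is_positive = np.array([0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0,
                        1, 1, 0, 0, 1, 1, 0, 1, 1, 1])
# Columns are [negative, positive], mirroring the one-hot encoding in the question.
one_hot = np.eye(2)[is_positive]


def make_windows(features, targets, window):
    """Build overlapping windows for one-step-ahead prediction.

    X[i] holds the features of days i .. i+window-1 and
    y[i] is the target of the day right after that window.
    """
    X = np.stack([features[i:i + window] for i in range(len(features) - window)])
    y = targets[window:]
    return X, y


def rolling_forecast(predict_next, X_test):
    """Predict one day at a time, moving through the test windows in order."""
    preds = []
    for window_features in X_test:
        batch = window_features[np.newaxis, ...]  # shape (1, window, n_features)
        preds.append(float(predict_next(batch)))
    return np.array(preds)


n = 5        # number of final values to hold out (the question's n)
window = 3   # assumed look-back length; not specified in the question

# Min-max scale using the training portion only, to avoid leaking test data.
lo, hi = values[:-n].min(), values[:-n].max()
scaled = (values - lo) / (hi - lo)

X, y = make_windows(one_hot, scaled, window)
X_train, y_train = X[:-n], y[:-n]   # everything before the last n days
X_test, y_test = X[-n:], y[-n:]     # targets are exactly the last n values

# Stand-in for a trained model (assumption): always predicts the training mean.
# With a fitted LSTM this would be something like a call to its predict method.
train_mean = y_train.mean()


def predict_next(batch):
    return train_mean


preds_scaled = rolling_forecast(predict_next, X_test)
preds = preds_scaled * (hi - lo) + lo  # undo the scaling
```

Because only the label categories are used as inputs here, each test window is already known in advance and the loop simply steps through them. If past values were also fed to the model, each prediction would have to be appended to the next window before predicting again.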
