Timeseries input to an LSTM

Question

I have a dataset of water samples collected from different locations. For example, ABC1 is a water sample taken from a river in Arizona, and ABC2 is a water sample taken from a river in Boston. Both are rivers and share the same feature columns (pH, temp, etc.), but because they are in different locations, the changes in their features are specific to each one. My goal is to create a single river model, because I do not have enough data to create individual models. There are 11 columns in total whose next-month values I want to predict. My dataset looks like this:

Date         Sample_Name        pH    temp    etc...

2009-01-01    ABC1              7.2    12
2009-01-02    ABC2              5.5    11
.
.
2009-01-02    ABC1              7.2    10
2009-01-02    ABC2              7.3    10
.
.
2013-06-02    ABC2              6.5    22
2013-06-04    ABC1              6.5    22
.
2015-01-05    ABC1              8.9    13
2015-01-05    ABC4              8.8    13

I want to feed every sample and its sequence to an LSTM model. For example, every measurement (row) of ABC1 must be given to the model as a sequence or a batch. Is it possible to do this kind of data preparation using TimeseriesGenerator? How can I prepare my data so that I can feed it to the model as described? Also, does it help to sort the dataset by date and by sample name (alphabetically)? I am trying to achieve something like this.

I want to generate the data using:

from keras.preprocessing.sequence import TimeseriesGenerator
n_timesteps = 2
n_features = 10
batch_size = 5
generator = TimeseriesGenerator(df, df, length = n_timesteps, sampling_rate = 10, stride = 1, batch_size = batch_size)

The simple LSTM model that I want to feed my data into:

from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.utils import Sequence

model = Sequential()
model.add(LSTM(n_features, activation='relu', input_shape=(n_timesteps, n_features)))
model.add(Dense(10))
model.compile(optimizer='adam', loss='mse', metrics = ['accuracy'])

Recommended answer

According to the docs, tf.keras.preprocessing.sequence.TimeseriesGenerator cannot take a dictionary as its first argument. The 'slice' error is just a manifestation of that: the function tries to take slices of the first argument (a dict) and fails. Again from the docs:

Arguments: data: Indexable generator (such as list or Numpy array) containing consecutive data points (timesteps).
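To make the "indexable" requirement concrete, here is a small sketch in plain Python. The `input_dict` contents and the `can_slice` helper are made up for illustration and are not part of Keras; the point is only that a dict keyed by sample name cannot be sliced by position, while a list of rows can.

```python
# Hypothetical stand-in for the asker's data: a dict keyed by sample name,
# where each value is a list of rows [pH, temp]. The numbers are invented.
input_dict = {"ABC1": [[7.2, 12], [7.2, 10], [6.5, 22]]}
rows = input_dict["ABC1"]


def can_slice(data):
    """Return True if data[0:2] works, i.e. data supports positional slicing."""
    try:
        data[0:2]
        return True
    except (TypeError, KeyError):
        # A dict treats the slice as a key. Depending on the Python version
        # this raises TypeError (unhashable slice) or KeyError (no such key).
        return False


dict_ok = can_slice(input_dict)  # the dict itself is not sliceable
list_ok = can_slice(rows)        # one sample's rows are sliceable
print(dict_ok, list_ok)
```

Anything that behaves like `rows` here (a list or a NumPy array of consecutive timesteps) fits the documented description of `data`.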

So perhaps you want to pass input_dict['ABC1'], or possibly input_dict['ABC1'].values.
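One way to extend this to all rivers is to build the windows separately for each sample name and then stack them, so that no window mixes rows from two locations. The sketch below does this with pandas and NumPy rather than with TimeseriesGenerator itself; the toy values, `feature_cols`, and the `make_windows` helper are assumptions for illustration. It assumes a stride and sampling rate of 1, with the row after each window used as the target.

```python
import numpy as np
import pandas as pd

# Toy data in the layout shown in the question; the values are made up.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2009-01-01", "2009-01-02", "2009-01-03", "2009-01-04",
        "2009-01-01", "2009-01-02", "2009-01-03",
    ]),
    "Sample_Name": ["ABC1", "ABC1", "ABC1", "ABC1", "ABC2", "ABC2", "ABC2"],
    "pH": [7.2, 7.1, 7.3, 7.0, 5.5, 5.6, 5.4],
    "temp": [12.0, 11.0, 10.0, 13.0, 11.0, 10.0, 9.0],
})
feature_cols = ["pH", "temp"]  # assumed feature columns
n_timesteps = 2


def make_windows(values, length):
    """Slide a window of `length` rows over `values`; the target is the next row."""
    X = np.stack([values[i - length:i] for i in range(length, len(values))])
    y = values[length:]
    return X, y


X_parts, y_parts = [], []
for name, group in df.groupby("Sample_Name"):
    # Sort each river by date so every window holds consecutive measurements.
    values = group.sort_values("Date")[feature_cols].to_numpy()
    if len(values) <= n_timesteps:
        continue  # not enough rows for one window plus a target
    X, y = make_windows(values, n_timesteps)
    X_parts.append(X)
    y_parts.append(y)

X_all = np.concatenate(X_parts)  # shape: (windows, n_timesteps, n_features)
y_all = np.concatenate(y_parts)  # shape: (windows, n_features)
print(X_all.shape, y_all.shape)
```

Grouping by sample name answers the sorting question indirectly: what matters is that rows are in date order within each sample, not how the samples are ordered relative to each other. The resulting `X_all` has the `(n_timesteps, n_features)` per-sample shape that the LSTM's `input_shape` expects.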
