Recurrent neural networks for Time Series with Multiple Variables - TensorFlow

Question

I'm using previous demand to predict future demand with 3 variables, but whenever I run the code, building my Y array raises an error.

If I use only one variable for Y on its own, there is no error.

Example:

demandaY = bike_data[['cnt']]
n_steps = 20

for time_step in range(1, n_steps+1):
    demandaY['cnt'+str(time_step)] = demandaY[['cnt']].shift(-time_step).values

y = demandaY.iloc[:, 1:].values
y = np.reshape(y, (y.shape[0], n_steps, 1))

Dataset

Script

features = ['cnt','temp','hum']
demanda = bike_data[features]
n_steps = 20

for var_col in features:
    for time_step in range(1, n_steps+1):
        demanda[var_col+str(time_step)] = demanda[[var_col]].shift(-time_step).values

demanda.dropna(inplace=True)
demanda.head()

n_var = len(features)
columns = list(filter(lambda col: not(col.endswith("%d" % n_steps)), demanda.columns))

X = demanda[columns].iloc[:, :(n_steps*n_var)].values
X = np.reshape(X, (X.shape[0], n_steps, n_var))

y = demanda.iloc[:, 0].values
y = np.reshape(y, (y.shape[0], n_steps, 1))

Output

ValueError: cannot reshape array of size 17379 into shape (17379,20,1)

GitHub:

Recommended answer

It's not clear whether the OP still needs an answer, but I will post the answer I linked in the comments, with a few modifications.

Time series datasets come in different forms. Let's consider a dataset that has X as features and Y as labels. Depending on the problem, Y might be a sample from X shifted in time, or it can be a separate target variable you want to predict.

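import numpy as np

def create_dataset(X, Y, look_back=10, label_lag=-1, stride=1):
    """Build sliding windows of X and the matching labels from Y."""
    dataX, dataY = [], []
    # Stop early enough that both the X window and the Y index stay in range
    # (with the original bound, label_lag=0 would index one row past the end of Y).
    last_start = min(len(X) - look_back, len(Y) - look_back - label_lag - 1)
    for i in range(0, last_start + 1, stride):
        dataX.append(X[i:(i + look_back)])          # rows i .. i+look_back-1
        dataY.append(Y[i + look_back + label_lag])  # label relative to the window end
    return np.array(dataX), np.array(dataY)

# `features` and `labels` are assumed to be pandas DataFrames (inputs and target).
print(features.values.shape, labels.shape)
# (619, 4) (619, 1)

x, y = create_dataset(X=features.values, Y=labels.values, look_back=10, stride=1)
print(x.shape, y.shape)
# (610, 10, 4) (610, 1)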
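
Applied to the setup in the question, the same windowing idea could look like the sketch below. The reshape in the question fails because `demanda.iloc[:, 0]` holds a single value per row, so it cannot fill a `(rows, 20, 1)` array. Here the data is synthetic (random numbers standing in for the `cnt`, `temp` and `hum` columns), the helper name `make_windows` is hypothetical, and the choice of target (the `cnt` column shifted one step ahead over the same window, as in the single-variable example) is an assumption about what the OP intended.

```python
import numpy as np

def make_windows(data, target_col=0, n_steps=20):
    """Slice a (rows, n_var) array into input windows and next-step target windows.

    Hypothetical helper for illustration; not part of any library.
    """
    X, y = [], []
    # The last usable start leaves room for a target window shifted by one row.
    for i in range(len(data) - n_steps):
        X.append(data[i:i + n_steps])                      # rows i .. i+n_steps-1, all variables
        y.append(data[i + 1:i + n_steps + 1, target_col])  # target column, one step ahead
    X = np.array(X)
    y = np.array(y)[..., np.newaxis]                       # add a trailing feature axis
    return X, y

# Synthetic stand-in for bike_data[['cnt', 'temp', 'hum']].values
rng = np.random.default_rng(0)
data = rng.random((100, 3))

X, y = make_windows(data, target_col=0, n_steps=20)
print(X.shape, y.shape)  # (80, 20, 3) (80, 20, 1)
```

Building the windows by slicing, rather than by adding shifted columns and reshaping, keeps the time axis and the variable axis separate from the start, so no reshape is needed.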

The other parameters:

  1. label_lag: if the last row of an X sample is at index t, the matching Y sample is at index t + label_lag + 1. The default value (-1) puts both X and Y at the same index t.

Indices of the last row of the second X sample, x[1], and of its label, y[1], in the original data:

# with label_lag = -1
np.where(x[1,-1]==features.values)[0], np.where(y[1] == labels.values)[0]
# (10,10,10,10), (10)

# with label_lag = 0
np.where(x[1,-1]==features.values)[0], np.where(y[1] == labels.values)[0]
# (10,10,10,10), (11)

  2. look_back: the number of past rows, counted back from the current time step t, that go into one sample. A look_back of 10 means each sample holds 10 consecutive rows, from t-9 to t.

  3. stride: the index gap between the starts of two consecutive samples. With stride=2 and look_back=10, if the first X sample holds rows 0 to 9 (slice 0:10), the second holds rows 2 to 11 (slice 2:12).
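
To see what each parameter does, it can help to run the function on an array whose values are their own row indices. The following is a minimal sketch along those lines; it repeats the windowing function so that it runs on its own, with the loop bound adjusted so that label_lag=0 stays in range (that adjustment is my assumption, not something stated in the answer).

```python
import numpy as np

def create_dataset(X, Y, look_back=10, label_lag=-1, stride=1):
    # Same windowing as in the answer, with the loop bound adjusted
    # so that label_lag >= 0 does not index past the end of Y.
    last_start = min(len(X) - look_back, len(Y) - look_back - label_lag - 1)
    dataX, dataY = [], []
    for i in range(0, last_start + 1, stride):
        dataX.append(X[i:i + look_back])
        dataY.append(Y[i + look_back + label_lag])
    return np.array(dataX), np.array(dataY)

idx = np.arange(30)  # each value is its own row index

for lag, stride in [(-1, 1), (0, 1), (-1, 2)]:
    x, y = create_dataset(idx, idx, look_back=10, label_lag=lag, stride=stride)
    print(f"label_lag={lag}, stride={stride}: "
          f"x[1] covers rows {x[1][0]}..{x[1][-1]}, y[1] is row {y[1]}, samples={len(x)}")
# label_lag=-1, stride=1: x[1] covers rows 1..10, y[1] is row 10, samples=21
# label_lag=0, stride=1: x[1] covers rows 1..10, y[1] is row 11, samples=20
# label_lag=-1, stride=2: x[1] covers rows 2..11, y[1] is row 11, samples=11
```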

Furthermore, depending on your problem, you can also use a look-back window in Y, and Y can be multi-dimensional. In that case the only change is b = Y[i:(i+look_back+label_lag)].
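
A minimal sketch of that variant, under a few assumptions of my own: the function name `create_dataset_seq` is hypothetical, label_lag defaults to 0 here so that each Y window has the same length as the X window, and the loop bound is chosen so that neither slice runs past the end of the data.

```python
import numpy as np

def create_dataset_seq(X, Y, look_back=10, label_lag=0, stride=1):
    """Variant where each label is a window of Y rows instead of a single row."""
    window_end = look_back + label_lag  # exclusive end offset of the Y slice
    # Keep both the X window and the Y window fully inside the data.
    last_start = min(len(X) - look_back, len(Y) - window_end)
    dataX, dataY = [], []
    for i in range(0, last_start + 1, stride):
        dataX.append(X[i:i + look_back])
        dataY.append(Y[i:i + window_end])  # the slice given in the text
    return np.array(dataX), np.array(dataY)

# Toy data: 50 rows, 4 feature columns and 2 target columns
X = np.arange(50 * 4).reshape(50, 4)
Y = np.arange(50 * 2).reshape(50, 2)

x, y = create_dataset_seq(X, Y, look_back=10, label_lag=0)
print(x.shape, y.shape)  # (41, 10, 4) (41, 10, 2)
```

With label_lag=-1 the same slice yields Y windows of look_back-1 rows, so it is worth checking the resulting shape against what the model expects.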

The same functionality can be achieved with TimeseriesGenerator from keras.

TimeseriesGenerator(features.values,labels.values,length=10,batch_size=64,stride=1)

where length is the same as look_back. By default there is a gap of 1 between features and labels, i.e. a sample in X covers rows t-9 to t and the corresponding sample in Y is at index t+1. If you want both at the same index, just shift the labels by one before passing them to the generator.
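
The offset and the shift can be checked without Keras. The sketch below emulates the indexing in plain NumPy, assuming the generator pairs the window data[i-length:i] with targets[i]; the function name `generator_like_windows` is hypothetical and this is not the Keras implementation.

```python
import numpy as np

def generator_like_windows(data, targets, length=10, stride=1):
    """Emulate the window/label pairing described above in plain NumPy."""
    xs, ys = [], []
    for row in range(length, len(data), stride):
        xs.append(data[row - length:row])  # rows row-length .. row-1
        ys.append(targets[row])            # label one step after the window
    return np.array(xs), np.array(ys)

idx = np.arange(30)  # each value is its own row index

# Default pairing: the label sits one row after the last row of the window.
x, y = generator_like_windows(idx, idx, length=10)
print(x[0][-1], y[0])  # 9 10

# Shift the labels forward by one row so that position i holds the label of row i-1.
shifted = np.empty_like(idx)
shifted[1:] = idx[:-1]
shifted[0] = idx[0]  # placeholder; never read, because labels start at row `length`

x, y = generator_like_windows(idx, shifted, length=10)
print(x[0][-1], y[0])  # 9 9
```

With a pandas DataFrame, the equivalent shift would be labels.shift(1), keeping in mind that the first row then becomes NaN.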
