Non-linear multivariate time-series response prediction using RNN

Problem description

I am trying to predict the hygrothermal response of a wall, given the interior and exterior climate. Based on literature research, I believe this should be possible with RNN but I have not been able to get good accuracy.

The dataset has 12 input features (time-series of exterior and interior climate data) and 10 output features (time-series of hygrothermal response), both containing hourly values for 10 years. The data was created with hygrothermal simulation software; there is no missing data.

Dataset features: (figure omitted)

Dataset targets: (figure omitted)

Unlike most time-series prediction problems, I want to predict the response for the full length of the input features time-series at each time step, rather than the subsequent values of a time-series (e.g. financial time-series prediction). I have not been able to find similar prediction problems (in similar or other fields), so if you know of one, references are very welcome.

I think this should be possible with an RNN, so I am currently using LSTM from Keras. Before training, I preprocess my data the following way (a short sketch follows the list):

  1. Discard the first year of data, as the first time steps of the hygrothermal response of the wall are influenced by the initial temperature and relative humidity.
  2. Split into training and testing set. The training set contains the first 8 years of data, the test set contains the remaining 2 years.
  3. Normalise the training set (zero mean, unit variance) using StandardScaler from Sklearn. Normalise the test set analogously, using the mean and variance from the training set.
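A minimal sketch of these three steps, assuming hypothetical arrays raw_X of shape (87600, 12) and raw_y of shape (87600, 10) holding the 10 years of hourly values:

import numpy as np
from sklearn.preprocessing import StandardScaler

H = 8760                                   # hours per year
X, y = raw_X[H:], raw_y[H:]                # 1. discard the first year

split = 7 * H                              # years 2-8 of the original data -> training
X_train, X_test = X[:split], X[split:]     # 2. remaining 2 years -> testing
y_train, y_test = y[:split], y[split:]

scaler = StandardScaler().fit(X_train)     # 3. fit the scaler on the training set only
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# add the sample dimension expected by Keras: (samples, timesteps, features)
X_train, X_test = X_train[np.newaxis], X_test[np.newaxis]   # (1, 61320, 12), (1, 17520, 12)
y_train, y_test = y_train[np.newaxis], y_test[np.newaxis]   # (1, 61320, 10), (1, 17520, 10)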

This results in: X_train.shape = (1, 61320, 12), y_train.shape = (1, 61320, 10), X_test.shape = (1, 17520, 12), y_test.shape = (1, 17520, 10).

As these are long time-series, I use stateful LSTM and cut the time-series as explained here, using the stateful_cut() function. I only have 1 sample, so batch_size is 1. For T_after_cut I have tried 24 and 120 (24*5); 24 appears to give better results. This results in X_train.shape = (2555, 24, 12), y_train.shape = (2555, 24, 10), X_test.shape = (730, 24, 12), y_test.shape = (730, 24, 10).
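With a single sample and batch_size = 1, the cutting step reduces to a plain reshape; a sketch of the idea (the actual stateful_cut() from the linked answer also handles several samples and larger batch sizes):

def stateful_cut(arr, T_after_cut):
    # split one long series (1, T, F) into T // T_after_cut consecutive
    # chunks of length T_after_cut, preserving their temporal order
    N, T, F = arr.shape
    assert N == 1 and T % T_after_cut == 0
    return arr.reshape(T // T_after_cut, T_after_cut, F)

X_train = stateful_cut(X_train, 24)   # (1, 61320, 12) -> (2555, 24, 12)
y_train = stateful_cut(y_train, 24)   # (1, 61320, 10) -> (2555, 24, 10)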

Next, I build and train the LSTM model as follows:

from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed
from keras.optimizers import Adam

model = Sequential()
model.add(LSTM(128,
               batch_input_shape=(batch_size, T_after_cut, features),
               return_sequences=True,
               stateful=True,
               ))
# one dense output per time step, mapping the 128 LSTM units to the 10 targets
model.add(TimeDistributed(Dense(targets)))
model.compile(loss='mean_squared_error', optimizer=Adam())

model.fit(X_train, y_train, epochs=100, batch_size=batch_size, verbose=2, shuffle=False)

Unfortunately, I don't get accurate prediction results; not even for the training set, thus the model has high bias.

Prediction results of the LSTM model for all targets (figure omitted)

How can I improve my model? I have already tried the following:

  1. Not discarding the first year of the dataset -> no significant difference
  2. Differencing the time-series of the input features (subtracting the previous value from the current value; see the one-line sketch after this list) -> slightly worse results
  3. Up to four stacked LSTM layers, all with the same hyperparameters -> no notable difference in results, but longer training time
  4. A Dropout layer after the LSTM layer (although this is usually used to reduce variance, and my model has high bias) -> slightly better results, but the difference may not be statistically significant
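For reference, the differencing in item 2 amounts to first-order differencing along the time axis; a minimal sketch on the (1, T, 12) input array:

X_diff = X_train[:, 1:, :] - X_train[:, :-1, :]   # x'[t] = x[t] - x[t-1], shape (1, T-1, 12)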

Am I doing something wrong with the stateful LSTM? Do I need to try different RNN models? Should I preprocess the data differently?

Furthermore, training is very slow: about 4 hours for the model above. Hence I am reluctant to do an extensive hyperparameter grid search...

Answer

In the end, I managed to solve this the following way:

  • Using more samples to train instead of only 1 (I used 18 samples to train and 6 to test)
  • Keeping the first year of data, as the output time-series for all samples have the same 'starting point' and the model needs this information to learn
  • Standardising both input and output features (zero mean, unit variance). I found this improved prediction accuracy and training speed
  • Using stateful LSTM as described here, but with states reset after each epoch (see below for code). I used batch_size = 6 and T_after_cut = 1460. If T_after_cut is longer, training is slower; if T_after_cut is shorter, accuracy decreases slightly. If more samples are available, I think using a larger batch_size will be faster.
  • Using CuDNNLSTM instead of LSTM; this sped up training by about 4x!
  • I found that more units resulted in higher accuracy and faster convergence (shorter training time). I also found that the GRU is as accurate as the LSTM, though it converged faster for the same number of units.
  • Monitoring validation loss during training and using early stopping
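The nb_cuts used by the callback below is the number of consecutive chunks each full series is cut into; a small sketch with assumed values (10 years of hourly data per sample):

T_total = 87600                     # assumed: 10 years x 8760 hourly values per sample
T_after_cut = 1460
nb_cuts = T_total // T_after_cut    # = 60 batches per pass over one full series
n_batch = 6                         # 6 samples processed in parallel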

The LSTM model is built and trained as follows:

from keras import layers
from keras.models import Sequential
from keras.optimizers import RMSprop
from keras.callbacks import Callback, EarlyStopping

def define_reset_states_batch(nb_cuts):
  class ResetStatesCallback(Callback):
    def __init__(self):
      self.counter = 0

    def on_batch_begin(self, batch, logs={}):
      # reset states when nb_cuts batches are completed,
      # i.e. when a new full-length series starts
      if self.counter % nb_cuts == 0:
        self.model.reset_states()
      self.counter += 1

    def on_epoch_end(self, epoch, logs={}):
      # reset states after each epoch
      self.model.reset_states()

  return ResetStatesCallback

model = Sequential()
model.add(layers.CuDNNLSTM(256, batch_input_shape=(batch_size, T_after_cut, features),
  return_sequences=True,
  stateful=True))
model.add(layers.TimeDistributed(layers.Dense(targets, activation='linear')))

optimizer = RMSprop(lr=0.002)
model.compile(loss='mean_squared_error', optimizer=optimizer)

earlyStopping = EarlyStopping(monitor='val_loss', min_delta=0.005, patience=15,
                              verbose=1, mode='auto')
ResetStatesCallback = define_reset_states_batch(nb_cuts)
model.fit(X_dev, y_dev, epochs=n_epochs, batch_size=n_batch, verbose=1, shuffle=False,
          validation_data=(X_eval, y_eval),
          callbacks=[ResetStatesCallback(), earlyStopping])

This gave me very satisfying accuracy (R2 over 0.98). The figure (omitted here) shows the temperature (left) and relative humidity (right) in the wall over 2 years (data not used in training), with the prediction in red and the true output in black. The residuals show that the error is very small and that the LSTM learns to capture the long-term dependencies to predict the relative humidity.
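A hypothetical sketch of how such a per-target R2 can be checked with scikit-learn, reusing X_eval, y_eval and n_batch from above:

from sklearn.metrics import r2_score

model.reset_states()                                # start from clean LSTM states
y_pred = model.predict(X_eval, batch_size=n_batch)

# flatten (samples, timesteps, targets) -> (samples * timesteps, targets)
r2 = r2_score(y_eval.reshape(-1, y_eval.shape[-1]),
              y_pred.reshape(-1, y_pred.shape[-1]),
              multioutput='raw_values')
print(r2)                                           # one R2 value per output feature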
