Time series prediction using RNNs (Keras) in R


Question

I am following the approach from Chollet's Deep Learning with R (fitting RNNs to time series data) to fit an RNN for time series prediction.

model <- keras_model_sequential() %>% 
  layer_gru(units = 32, 
            dropout = 0.1, 
            recurrent_dropout = 0.5,
            return_sequences = TRUE,
            input_shape = list(NULL, dim(data)[[-1]])) %>% 
  layer_gru(units = 64, activation = "relu",
            dropout = 0.1,
            recurrent_dropout = 0.5) %>% 
  layer_dense(units = 1)

model %>% compile(
  optimizer = optimizer_rmsprop(),
  loss = "mae"
)

history <- model %>% fit_generator(
  train_gen,
  steps_per_epoch = 500,
  epochs = 40,
  validation_data = val_gen,
  validation_steps = val_steps
)

Here, the training, validation, and test data are generated as follows:

lookback <- 1440
step <- 6
delay <- 144
batch_size <- 128

train_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 1,
  max_index = 200000,
  shuffle = TRUE,
  step = step, 
  batch_size = batch_size
)

val_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 200001,
  max_index = 300000,
  step = step,
  batch_size = batch_size
)

test_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 300001,
  max_index = NULL,
  step = step,
  batch_size = batch_size
)

# How many steps to draw from val_gen in order to see the entire validation set
val_steps <- (300000 - 200001 - lookback) / batch_size

# How many steps to draw from test_gen in order to see the entire test set
test_steps <- (nrow(data) - 300001 - lookback) / batch_size
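The step counts computed above are generally not whole numbers. A minimal sketch of the arithmetic, assuming the weather data has 420551 rows (an assumption used only for illustration; substitute `nrow(data)`), and rounding down so the number of batches is explicit:

```r
# Assumed row count of the weather data; replace with nrow(data) in practice.
n_rows <- 420551
lookback <- 1440
batch_size <- 128

# Unrounded values are about 769.99 and 930.55; floor() keeps whole batches only.
val_steps  <- floor((300000 - 200001 - lookback) / batch_size)
test_steps <- floor((n_rows - 300001 - lookback) / batch_size)

print(c(val_steps = val_steps, test_steps = test_steps))
```

Rounding down is a design choice made here, not something the original code does. The formula also ignores `delay`, so the result may still slightly overestimate the number of distinct batches the generator can produce.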

After this, I read the Keras documentation and found the evaluation and prediction functions. I evaluated the model on the test data with:

m <- model %>% evaluate_generator(test_gen, steps = test_steps)
m

However, this returns only the loss value for the test data.

My question is: how can I get a prediction for each point in the test dataset, as other time series methods provide? And how can I plot the predicted values against the actual values?

Recommended answer

It seems to me that you need to redefine the generator so that it returns only the samples, without the targets. Following your example:

# generator function
generator <- function(data, lookback, delay, min_index, max_index,
                      shuffle = FALSE, batch_size = 128, step = 6) {
  if (is.null(max_index))
    max_index <- nrow(data) - delay - 1
  i <- min_index + lookback
  function() {
    if (shuffle) {
      rows <- sample(c((min_index+lookback):max_index), size = batch_size)
    } else {
      if (i + batch_size >= max_index)
        i <<- min_index + lookback
      rows <- c(i:min(i+batch_size-1, max_index))
      i <<- i + length(rows)
    }

    samples <- array(0, dim = c(length(rows), 
                                lookback / step,
                                dim(data)[[-1]]))
    targets <- array(0, dim = c(length(rows)))

    for (j in 1:length(rows)) {
      indices <- seq(rows[[j]] - lookback, rows[[j]]-1, 
                     length.out = dim(samples)[[2]])
      samples[j,,] <- data[indices,]
      targets[[j]] <- data[rows[[j]] + delay,2]
    }            

    list(samples) # just the samples, (quick and dirty solution, I just removed targets)
  }
}

# test_gen is the same
test_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 300001,
  max_index = NULL,
  step = step,
  batch_size = batch_size
)

Now you can call predict_generator:

preds <- model %>% predict_generator(test_gen, steps = test_steps)
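To plot predictions against actual values, each prediction has to be matched with the row its target came from. The sketch below replays the row-selection logic of the non-shuffled generator in base R. The helper `target_rows()` and the synthetic matrix `toy` are hypothetical names introduced for illustration, and the `preds` vector here is a noisy placeholder standing in for the output of `predict_generator()`.

```r
# Hypothetical helper: returns, for each sample the non-shuffled generator
# emits, the row index of its target (sample row + delay).
target_rows <- function(n_rows, lookback, delay, min_index, max_index = NULL,
                        batch_size = 128, steps = 1) {
  if (is.null(max_index)) max_index <- n_rows - delay - 1
  i <- min_index + lookback
  out <- integer(0)
  for (s in seq_len(steps)) {
    # Same wrap-around rule as the generator: restart when rows run out.
    if (i + batch_size >= max_index) i <- min_index + lookback
    rows <- i:min(i + batch_size - 1, max_index)
    i <- i + length(rows)
    out <- c(out, rows + delay)
  }
  out
}

# Synthetic stand-in for the normalized data; column 2 plays the role of T (degC).
set.seed(42)
toy <- cbind(p = rnorm(500), temp = cumsum(rnorm(500)))

rows <- target_rows(nrow(toy), lookback = 20, delay = 5, min_index = 301,
                    batch_size = 16, steps = 3)
actual <- toy[rows, 2]

# Placeholder for the model output; with a fitted model this would be
# as.numeric(preds) from predict_generator().
preds <- actual + rnorm(length(actual), sd = 0.3)

plot(actual, type = "l", xlab = "test sample", ylab = "temperature (normalized)")
lines(preds, col = "red")
legend("topleft", legend = c("actual", "predicted"),
       col = c("black", "red"), lty = 1)
```

Because the generator restarts from the beginning once it runs out of rows, asking for more steps than there are full batches would repeat earlier samples, so it is probably worth checking that `rows` contains no duplicates before plotting.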

The predictions now need to be de-normalized, because each variable was scaled before the fit.

denorm_pred = preds * std + mean

Note that std and mean here must be the statistics of the T (degC) column only, computed on the training data only. If they are computed on the full dataset, information from the test set leaks into the scaling and the results look better than they should.
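The de-normalization step can be sketched in base R as follows. The matrix `raw`, the training range `train_rows`, and the names `temp_mean` and `temp_sd` are assumptions made for this example; in the real workflow the temperature is column 2 of the data and the training range is the one passed to `train_gen`.

```r
# Synthetic stand-in for the unscaled data; column 2 plays the role of T (degC).
set.seed(1)
raw <- cbind(p = rnorm(1000, 990, 8), temp = rnorm(1000, 9, 8))

train_rows <- 1:600                      # hypothetical training range

# Scaling statistics for the temperature column, from training rows only.
temp_mean <- mean(raw[train_rows, 2])
temp_sd   <- sd(raw[train_rows, 2])

# Scale the way the data was scaled before the fit ...
scaled_temp <- (raw[, 2] - temp_mean) / temp_sd

# ... and undo it. Here the "predictions" are just scaled true values,
# so de-normalizing should recover the original temperatures.
preds <- scaled_temp[601:610]
denorm_pred <- preds * temp_sd + temp_mean

print(all.equal(denorm_pred, raw[601:610, 2]))
```

Using scalar statistics for the one predicted column avoids accidental recycling when `mean` and `std` are stored as per-column vectors.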
