如何在Windows上的R中将蒙特卡洛用于ARIMA仿真功能 [英] How to use Monte Carlo for ARIMA Simulation Function in R on Windows

查看:57
本文介绍了如何在Windows上的R中将蒙特卡洛用于ARIMA仿真功能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这是我要使用 R 的算法:

  1. 通过 arima.sim()函数从 ARIMA 模型中模拟10个时间序列数据集
  2. 将系列划分为可能的 2s 3s 4s 5s 6s 7s 8s 9s .
  3. 对于每种大小,请对块进行重新采样并进行替换,以获取新系列,并通过 auto.arima()从每种块大小的子系列中获得最佳的 ARIMA 模型功能.
  4. 获取每个块大小 RMSE 的每个子系列.
  1. Simulate 10 time series data set from ARIMA model through arima.sim() function
  2. Split the series into sub-series of possible 2s, 3s, 4s, 5s, 6s, 7s, 8s, and 9s.
  3. For each size take a resample the blocks with replacement, for new series and obtain the best ARIMA model from the subseries from each block size through auto.arima() function.
  4. Obtain for each subseries of each block sizes RMSE.

下面的 R 函数可以完成该操作.

The below R function get that done.

## Load packages and prepare multicore process
library(forecast)
library(future.apply)
plan(multisession)
library(parallel)
library(foreach)
library(doParallel)
n_cores <- detectCores()
cl <- makeCluster(n_cores)
registerDoParallel(cores = detectCores())
## simulate ARIMA(1,0, 0)
#n=10; phi <- 0.6; order <- c(1, 0, 0)
bootstrap1 <- function(n, phi){
  ts <- arima.sim(n, model = list(ar=phi, order = c(1, 0, 0)), sd = 1)
  ########################################################
  ## create a vector of block sizes
  t <- length(ts)    # the length of the time series
  lb <- seq(n-2)+1   # vector of block sizes to be 1 < l < n (i.e to be between 1 and n exclusively)
  ########################################################
  ## This section create matrix to store block means
  BOOTSTRAP <- matrix(nrow = 1, ncol = length(lb))
  colnames(BOOTSTRAP) <-lb
  ########################################################
  ## This section use foreach function to do detail in the brace
  BOOTSTRAP <- foreach(b = 1:length(lb), .combine = 'cbind') %do%{
    l <- lb[b]# block size at each instance 
    m <- ceiling(t / l)                                 # number of blocks
    blk <- split(ts, rep(1:m, each=l, length.out = t))  # divides the series into blocks
    ######################################################
    res<-sample(blk, replace=T, 10)        # resamples the blocks
    res.unlist <- unlist(res, use.names = FALSE)   # unlist the bootstrap series
    train <- head(res.unlist, round(length(res.unlist) - 10)) # Train set
    test <- tail(res.unlist, length(res.unlist) - length(train)) # Test set
    nfuture <- forecast::forecast(train, model = forecast::auto.arima(train), lambda=0, biasadj=TRUE, h = length(test))$mean        # makes the `forecast of test set
    RMSE <- Metrics::rmse(test, nfuture)      # RETURN RMSE
    BOOTSTRAP[b] <- RMSE
  }
  BOOTSTRAPS <- matrix(BOOTSTRAP, nrow = 1, ncol = length(lb))
  colnames(BOOTSTRAPS) <- lb
  BOOTSTRAPS
  return(list(BOOTSTRAPS))
}

调用函数

bootstrap1(10, 0.6)

我得到以下结果:

##              2        3         4        5        6        7         8         9
##  [1,] 0.8920703 0.703974 0.6990448 0.714255 1.308236 0.809914 0.5315476 0.8175382

我想按时间顺序重复上述第1步第4步,然后我想到 R <中的 Monte Carlo 技术/code>.因此,我加载了它的包并运行以下功能:

I want to repeat the above step 1 to step 4 chronologically, then I think of Monte Carlo technology in R. Thus, I load its package and run the below function:

param_list=list("n"=10, "phi"=0.6)
library(MonteCarlo)
MC_result<-MonteCarlo(func = bootstrap1, nrep=3, param_list = param_list)

期望以 matrix 形式获得以下结果:

expecting to get a like of the below result in matrix form:

##           [,2]     [,3]      [,4]    [,5]       [,6]      [,7]      [,8]      [,9]
##  [1,] 0.8920703 0.703974  0.6990448 0.714255  1.308236  0.809914  0.5315476 0.8175382
##  [2,] 0.8909836 0.8457537 1.095148  0.8918468 0.8913282 0.7894167 0.8911484 0.8694729
##  [3,] 1.586785  1.224003  1.375026  1.292847  1.437359  1.418744  1.550254  1.30784

但是我收到以下错误消息:

but I get the following error message:

MonteCarlo中的错误(func = bootstrap1,nrep = 3,param_list = param_list):func必须返回包含命名组件的列表.每个分量都必须是标量的.

Error in MonteCarlo(func = bootstrap1, nrep = 3, param_list = param_list) : func has to return a list with named components. Each component has to be scalar.

我如何找到获得上述期望结果并使结果可再现的方式?

编辑

我希望可以在 Windows

推荐答案

您收到此错误消息是因为MonteCarlo期望 bootstrap1()接受一个参数组合进行仿真并且每次复制仅返回一个值( RMSE ).这里不是这种情况,因为块长度( lb )由模拟时间序列( n )的长度确定( in bootstrap1 ,这样您将获得每个调用 n-2 个块长度的结果.

You get this error message because MonteCarlo expects bootstrap1() to accept one parameter combination for the simulation and that it only returns one value (RMSE) per replication. This is not the case here since the block length (lb) is determined by the length of the simulated time series (n) within bootstrap1 and so you will get results for n - 2 block lengths for each call.

一种解决方案是将块长度作为参数传递,并适当地重写 bootstrap1():

A solution is to pass the block length as a parameter and rewrite bootstrap1() appropriately:

library(MonteCarlo)
library(forecast)
library(Metrics)

# parameter grids
n <- 10 # length of time series
lb <- seq(n-2) + 1 # vector of block sizes
phi <- 0.6 # autoregressive parameter
reps <- 3 # monte carlo replications

# simulation function  
bootstrap1 <- function(n, lb, phi) {
    
    #### simulate ####
    ts <- arima.sim(n, model = list(ar = phi, order = c(1, 0, 0)), sd = 1)
    
    #### devide ####
    m <- ceiling(n / lb) # number of blocks
    blk <- split(ts, rep(1:m, each = lb, length.out = n)) # divide into blocks
    #### resample ####
    res <- sample(blk, replace = TRUE, 10)        # resamples the blocks
    res.unlist <- unlist(res, use.names = FALSE)   # unlist the bootstrap series
    #### train, forecast ####
    train <- head(res.unlist, round(length(res.unlist) - 10)) # train set
    test <- tail(res.unlist, length(res.unlist) - length(train)) # test set
    nfuture <- forecast(train, # forecast
                        model = auto.arima(train), 
                        lambda = 0, biasadj = TRUE, h = length(test))$mean    
    ### metric ####
    RMSE <- rmse(test, nfuture) # return RMSE
    return(
      list("RMSE" = RMSE)
    )
}

param_list = list("n" = n, "lb" = lb, "phi" = phi)

要运行模拟,请将参数以及 bootstrap1()传递给 MonteCarlo().为了并行进行仿真,您需要通过 ncpus 设置内核数.MonteCarlo软件包使用snowFall,因此应在Windows上运行.

To run the simulation, pass the parameters as well as bootstrap1() to MonteCarlo(). For the simulation to be carried out in parallel you need to set the number of cores via ncpus. The MonteCarlo package uses snowFall, so it should run on Windows.

请注意,我还设置了 raw = T (否则结果将是所有复制的平均值).事先设置种子将使结果具有可重复性.

Note that I also set raw = T (otherwise the outcomes would be averages over all replications). Setting the seed before will make the results reproducible.

set.seed(123)
MC_result <- MonteCarlo(func = bootstrap1, 
                        nrep = reps,
                        ncpus = parallel::detectCores() - 1,
                        param_list = param_list,
                        export_also = list(
                         "packages" = c("forecast", "Metrics")
                        ),
                        raw = T)

结果是一个数组.我认为最好通过 MakeFrame()将其转换为data.frame:

The result is an array. I think it's best to transform it into a data.frame via MakeFrame():

Frame <- MakeFrame(MC_result)

尽管很容易获得 reps x lb 矩阵:

matrix(Frame$RMSE, ncol = length(lb), dimnames = list(1:reps, lb))

这篇关于如何在Windows上的R中将蒙特卡洛用于ARIMA仿真功能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆