两种布朗运动之间的相关性的蒙特卡洛模拟(连续随机游动) [英] Monte Carlo simulation of correlation between two Brownian motion (continuous random walk)

查看:445
本文介绍了两种布朗运动之间的相关性的蒙特卡洛模拟(连续随机游动)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

y <- cumsum(rnorm(100,0,1)) # random normal, with small (1.0) drift.
y.ts <- ts(y)
x <- cumsum(rnorm(100,0,1))
x
x.ts <- ts(x)
ts.plot(y.ts,ty= "l", x.ts) # plot the two random walks 


Regression.Q1 = lm(y~x) ; summary(lm2) 
summary(Regression.Q1)

t.test1 <- (summary(Regression.Q1)$coef[2,3]) # T-test computation 


y[t] = y[t-1] + epsilon[t]
epsilon[t] ~ N(0,1)
set.seed(1)
t=1000
epsilon=sample(c(-1,1), t, replace = 1) # Generate k random walks across time {0, 1, ... , T}


N=T=1e3
y=t(apply(matrix(sample(c(-1,1),N*T,rep=TRUE),ncol=T),1,cumsum))
y[1]<-0
for (i in 2:t) {
  y[i]<-y[i-1]+epsilon[i]
}

我需要:

重复该过程1000次(Monte Carlo模拟),即围绕先前的程序构建一个循环,并每次保存t统计信息.您将具有1; 000个t检验的序列:S =(t-test1,t-test2,...,t-test1000).计算1,000次t检验的绝对值> 1.96的时间次数,该临界值在5%的显着性水平.如果系列是I(0),您将发现大约5%.这里不是这种情况(虚假回归).

Repeat the process 1,000 times (Monte Carlo simulations), namely build a loop around the previous program and each time save the t statistics. You will have a sequence of 1;000 t-tests : S = (t-test1, t-test2, ... , t-test1000). Count the number of time the absolute value of the 1,000 t-tests > 1.96, the critical value at a 5% significance level. If the series were I(0) you would have found roughly 5%. It won't be the case here (spurious regression).

我需要添加什么来保存各个系数?

What do I need to add to save the respective coefficients ?

推荐答案

您发布的与y[t] = y[t-1] + epsilon[t]相关的代码不是真正的工作代码,但是我可以看到您正在尝试存储所有1000 * 2随机游动.实际上,没有必要这样做.我们只关心t分数,而不是那些随机游走的实现是什么.

Your posted code related to y[t] = y[t-1] + epsilon[t] is not real working code, but I can see that you are trying to store all 1000 * 2 random walk. Actually there is no need to do this. We only care about t-score rather than what those realizations of random walk are.

对于这类问题,我们旨在多次复制一个过程,因此首先编写一个函数来一次执行这样一个过程很方便.您已经为此拥有了良好的工作代码.我们只需要将它包装在一个函数中(删除那些不必要的部分,例如plot):

For this kind of problem, where we aim to replicate a procedure a lot of times, it is handy to first write a function to execute such a procedure for a single time. You already had good working code for this; we just need to wrap it in a function (removing those unnecessary part like plot):

sim <- function () {
  y <- cumsum(rnorm(100,0,1))
  x <- cumsum(rnorm(100,0,1))
  coef(summary(lm(y ~ x)))[2,3]
  }

此功能不需输入;它只会返回一个实验的t分数.

This function takes no input; it only returns the t-score for one experiment.

现在,我们将重复1000次.我们可以编写一个for循环,但是函数replicate更容易(如有必要,请阅读?replicate)

Now, we are going to repeat this 1000 times. We can write a for loop, but function replicate is easier (read ?replicate if necessary)

S <- replicate(1000, sim())

请注意,这将花费一些时间,比执行此简单任务要慢得多,因为lmsummary.lm都很慢.稍后将显示一种更快的方法.

Note this will take some time, much slower than it should be for such a simple task, because both lm and summary.lm are slow. A much faster way will be shown later.

现在,S是具有1000个值的向量,这是您想要的"1000个t检验的序列" .要获得"1,000次t检验的绝对值> 1.96的次数" ,我们可以使用

Now, S is vector with 1000 values, which is the "a sequence of 1000 t-tests" you want. To get "the number of time the absolute value of the 1,000 t-tests > 1.96", we can just use

sum(abs(S) > 1.96)
# [1] 756

结果756就是我得到的;因为模拟是随机的,所以您会得到一些不同的东西.但这将总是如预期的那样大.

The result 756 is just what I get; you will get something different as the simulation is random. But it will always be quite a large number as expected.

sim的更快版本:

A faster version of sim:

fast_sim <- function () {
  y <- cumsum(rnorm(100,0,1))
  x <- cumsum(rnorm(100,0,1))
  y <- y - mean(y)
  x <- x - mean(x)
  xty <- crossprod(x,y)[1]
  xtx <- crossprod(x)[1]
  b <- xty / xtx
  sigma <- sqrt(sum((y - x * b) ^ 2) / 98)
  b * sqrt(xtx) * sigma
  }

此函数在不使用lm的情况下计算简单的线性回归,而在不使用summary.lm的情况下计算t分数.

This function computes simple linear regression without lm, and t-score without summary.lm.

S <- replicate(1000, fast_sim())
sum(abs(S) > 1.96)
# [1] 778

另一种方法是使用cor.test:

fast_sim2 <- function () {
  y <- cumsum(rnorm(100,0,1))
  x <- cumsum(rnorm(100,0,1))
  unname(cor.test(x, y)[[1]])
  }

S <- replicate(1000, fast_sim())
sum(abs(S) > 1.96)
# [1] 775

让我们有一个基准:

system.time(replicate(1000, sim()))
#   user  system elapsed 
#  1.860   0.004   1.867 

system.time(replicate(1000, fast_sim()))
#   user  system elapsed 
#  0.088   0.000   0.090 

system.time(replicate(1000, fast_sim2()))
#   user  system elapsed 
#  0.308   0.004   0.312 

cor.testlm + summary.lm快得多,但是手动计算甚至更快!

cor.test is much faster than lm + summary.lm, but manual computation is even faster!

这篇关于两种布朗运动之间的相关性的蒙特卡洛模拟(连续随机游动)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆