两种布朗运动之间的相关性的蒙特卡洛模拟(连续随机游动) [英] Monte Carlo simulation of correlation between two Brownian motion (continuous random walk)
问题描述
y <- cumsum(rnorm(100,0,1)) # random normal, with small (1.0) drift.
y.ts <- ts(y)
x <- cumsum(rnorm(100,0,1))
x
x.ts <- ts(x)
ts.plot(y.ts,ty= "l", x.ts) # plot the two random walks
Regression.Q1 = lm(y~x) ; summary(lm2)
summary(Regression.Q1)
t.test1 <- (summary(Regression.Q1)$coef[2,3]) # T-test computation
y[t] = y[t-1] + epsilon[t]
epsilon[t] ~ N(0,1)
set.seed(1)
t=1000
epsilon=sample(c(-1,1), t, replace = 1) # Generate k random walks across time {0, 1, ... , T}
N=T=1e3
y=t(apply(matrix(sample(c(-1,1),N*T,rep=TRUE),ncol=T),1,cumsum))
y[1]<-0
for (i in 2:t) {
y[i]<-y[i-1]+epsilon[i]
}
我需要:
重复该过程1000次(Monte Carlo模拟),即围绕先前的程序构建一个循环,并每次保存t统计信息.您将具有1; 000个t检验的序列:S =(t-test1,t-test2,...,t-test1000).计算1,000次t检验的绝对值> 1.96的时间次数,该临界值在5%的显着性水平.如果系列是I(0),您将发现大约5%.这里不是这种情况(虚假回归).
Repeat the process 1,000 times (Monte Carlo simulations), namely build a loop around the previous program and each time save the t statistics. You will have a sequence of 1;000 t-tests : S = (t-test1, t-test2, ... , t-test1000). Count the number of time the absolute value of the 1,000 t-tests > 1.96, the critical value at a 5% significance level. If the series were I(0) you would have found roughly 5%. It won't be the case here (spurious regression).
我需要添加什么来保存各个系数?
What do I need to add to save the respective coefficients ?
推荐答案
您发布的与y[t] = y[t-1] + epsilon[t]
相关的代码不是真正的工作代码,但是我可以看到您正在尝试存储所有1000 * 2随机游动.实际上,没有必要这样做.我们只关心t分数,而不是那些随机游走的实现是什么.
Your posted code related to y[t] = y[t-1] + epsilon[t]
is not real working code, but I can see that you are trying to store all 1000 * 2 random walk. Actually there is no need to do this. We only care about t-score rather than what those realizations of random walk are.
对于这类问题,我们旨在多次复制一个过程,因此首先编写一个函数来一次执行这样一个过程很方便.您已经为此拥有了良好的工作代码.我们只需要将它包装在一个函数中(删除那些不必要的部分,例如plot
):
For this kind of problem, where we aim to replicate a procedure a lot of times, it is handy to first write a function to execute such a procedure for a single time. You already had good working code for this; we just need to wrap it in a function (removing those unnecessary part like plot
):
sim <- function () {
y <- cumsum(rnorm(100,0,1))
x <- cumsum(rnorm(100,0,1))
coef(summary(lm(y ~ x)))[2,3]
}
此功能不需输入;它只会返回一个实验的t分数.
This function takes no input; it only returns the t-score for one experiment.
现在,我们将重复1000次.我们可以编写一个for
循环,但是函数replicate
更容易(如有必要,请阅读?replicate
)
Now, we are going to repeat this 1000 times. We can write a for
loop, but function replicate
is easier (read ?replicate
if necessary)
S <- replicate(1000, sim())
请注意,这将花费一些时间,比执行此简单任务要慢得多,因为lm
和summary.lm
都很慢.稍后将显示一种更快的方法.
Note this will take some time, much slower than it should be for such a simple task, because both lm
and summary.lm
are slow. A much faster way will be shown later.
现在,S
是具有1000个值的向量,这是您想要的"1000个t检验的序列" .要获得"1,000次t检验的绝对值> 1.96的次数" ,我们可以使用
Now, S
is vector with 1000 values, which is the "a sequence of 1000 t-tests" you want. To get "the number of time the absolute value of the 1,000 t-tests > 1.96", we can just use
sum(abs(S) > 1.96)
# [1] 756
结果756就是我得到的;因为模拟是随机的,所以您会得到一些不同的东西.但这将总是如预期的那样大.
The result 756 is just what I get; you will get something different as the simulation is random. But it will always be quite a large number as expected.
sim
的更快版本:
A faster version of sim
:
fast_sim <- function () {
y <- cumsum(rnorm(100,0,1))
x <- cumsum(rnorm(100,0,1))
y <- y - mean(y)
x <- x - mean(x)
xty <- crossprod(x,y)[1]
xtx <- crossprod(x)[1]
b <- xty / xtx
sigma <- sqrt(sum((y - x * b) ^ 2) / 98)
b * sqrt(xtx) * sigma
}
此函数在不使用lm
的情况下计算简单的线性回归,而在不使用summary.lm
的情况下计算t分数.
This function computes simple linear regression without lm
, and t-score without summary.lm
.
S <- replicate(1000, fast_sim())
sum(abs(S) > 1.96)
# [1] 778
另一种方法是使用cor.test
:
fast_sim2 <- function () {
y <- cumsum(rnorm(100,0,1))
x <- cumsum(rnorm(100,0,1))
unname(cor.test(x, y)[[1]])
}
S <- replicate(1000, fast_sim())
sum(abs(S) > 1.96)
# [1] 775
让我们有一个基准:
system.time(replicate(1000, sim()))
# user system elapsed
# 1.860 0.004 1.867
system.time(replicate(1000, fast_sim()))
# user system elapsed
# 0.088 0.000 0.090
system.time(replicate(1000, fast_sim2()))
# user system elapsed
# 0.308 0.004 0.312
cor.test
比lm
+ summary.lm
快得多,但是手动计算甚至更快!
cor.test
is much faster than lm
+ summary.lm
, but manual computation is even faster!
这篇关于两种布朗运动之间的相关性的蒙特卡洛模拟(连续随机游动)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!