Constructing correlated variables
Question
I have a variable with a given distribution (normal in the example below).
set.seed(32)
var1 = rnorm(100,mean=0,sd=1)
I want to create a variable (var2) that is correlated with var1, with a linear correlation coefficient (roughly or exactly) equal to "Corr". The slope of the regression between var1 and var2 should (roughly or exactly) equal 1.
Corr = 0.3
How can I achieve this?
I was thinking of something like this:
decorelation = rnorm(100,mean=0,sd=1-Corr)
var2 = var1 + decorelation
But of course, when running:
cor(var1,var2)
the result is not close to Corr!
Recommended answer
I did something similar a while ago. I am pasting some code written for 3 correlated variables, but it can easily be generalized to something more complex.
First, create an F matrix, that is, the target correlation matrix:
cor_Matrix <- matrix(c (1.00, 0.90, 0.20 ,
0.90, 1.00, 0.40 ,
0.20, 0.40, 1.00),
nrow=3,ncol=3,byrow=TRUE)
This can be an arbitrary correlation matrix.
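"Arbitrary" still means a valid correlation matrix: symmetric, with ones on the diagonal, and with no negative eigenvalues. The following is a minimal sketch of such a check in base R; it is not part of the original answer, and the name `eigenvalues` is only illustrative.

```r
# Target correlation matrix, repeated here so the snippet runs on its own
cor_Matrix <- matrix(c(1.00, 0.90, 0.20,
                       0.90, 1.00, 0.40,
                       0.20, 0.40, 1.00),
                     nrow = 3, ncol = 3, byrow = TRUE)

# A correlation matrix must be symmetric with a unit diagonal
isSymmetric(cor_Matrix)
all(diag(cor_Matrix) == 1)

# All eigenvalues must be non-negative (strictly positive here)
eigenvalues <- eigen(cor_Matrix, symmetric = TRUE)$values
eigenvalues
all(eigenvalues > 0)
```

If any eigenvalue were negative, no set of variables could have exactly those pairwise correlations, and the matrix would need to be adjusted first.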
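library(psych)
# Extract all three principal components, unrotated, so that
# loadings %*% t(loadings) reproduces cor_Matrix
fit <- principal(cor_Matrix, nfactors=3, rotate="none")
fit$loadings
loadings <- matrix(fit$loadings[1:3, 1:3], nrow=3, ncol=3, byrow=FALSE)
loadings
# Create three independent standard normal variables, 3000 cases each
# (one variable per row; increased from 150 cases in an earlier version)
cases <- t(replicate(3, rnorm(3000)))
# Mix the independent variables with the loadings to induce the correlations
multivar <- loadings %*% cases
T_multivar <- t(multivar)
var <- as.data.frame(T_multivar)
# The sample correlations should be close to cor_Matrix
cor(var)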
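For readers who would rather not depend on psych, a Cholesky factor is a commonly used alternative way to mix independent normal variables into correlated ones. Note that this is a different technique from the principal-component loadings used above, shown only as a sketch under that assumption; the names `n`, `R`, `Z` and `X` are illustrative.

```r
set.seed(1)

# Target correlation matrix, repeated here so the snippet runs on its own
cor_Matrix <- matrix(c(1.00, 0.90, 0.20,
                       0.90, 1.00, 0.40,
                       0.20, 0.40, 1.00),
                     nrow = 3, ncol = 3, byrow = TRUE)

n <- 3000

# chol() returns an upper triangular R with t(R) %*% R equal to cor_Matrix
R <- chol(cor_Matrix)

# n independent standard normal cases, one variable per column
Z <- matrix(rnorm(n * 3), nrow = n, ncol = 3)

# Each row of X is a linear mix of independent normals with covariance t(R) %*% R
X <- Z %*% R

# Sample correlations, which should be close to cor_Matrix
round(cor(X), 2)
```

The sample correlations only approximate the target; the gap shrinks as `n` grows.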
Again, this can be generalized. Your approach listed above does not create a multivariate data set.
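For the two-variable case in the question, it may help to see why `sd = 1 - Corr` misses the target. Adding independent noise with standard deviation s to a unit-variance var1 gives an expected correlation of 1/sqrt(1 + s^2), so s = 0.7 yields roughly 0.82 rather than 0.3. Solving for s gives s = sqrt(1/Corr^2 - 1). The sketch below is not part of the original answer; `noise_sd`, `raw_noise`, `resid_noise`, `var2_approx` and `var2_exact` are illustrative names. It also shows one way to hit the sample correlation and the slope exactly, by making the noise exactly uncorrelated with var1 in the sample.

```r
set.seed(32)
var1 <- rnorm(100, mean = 0, sd = 1)
Corr <- 0.3

# Noise sd that gives cor(var1, var1 + noise) = Corr in expectation
noise_sd <- sd(var1) * sqrt(1 / Corr^2 - 1)

# Approximate version: correlation and slope are only roughly on target
var2_approx <- var1 + rnorm(100, mean = 0, sd = noise_sd)
cor(var1, var2_approx)

# Exact version: residuals of a regression on var1 have zero sample
# covariance with var1, so rescaling them fixes the correlation exactly
raw_noise <- rnorm(100)
resid_noise <- residuals(lm(raw_noise ~ var1))
resid_noise <- resid_noise / sd(resid_noise) * noise_sd
var2_exact <- var1 + resid_noise

cor(var1, var2_exact)            # sample correlation
coef(lm(var2_exact ~ var1))[2]   # slope of var2 regressed on var1
```

Because the added noise is uncorrelated with var1, the slope of var2 regressed on var1 stays at 1, while the noise scale alone controls the correlation.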