Defining a function that calculates the covariance matrix of a correlation matrix
Problem description
I have some problems with the transformation of a matrix and the names of the rows and columns.
My problem is as follows:
As input I have a (symmetric) correlation matrix like this one:
1.000 0.561 0.393 0.561
0.561 1.000 0.286 0.549
0.393 0.286 1.000 0.286
0.561 0.549 0.286 1.000
The correlation-vector is given by the values of the lower triangular matrix:
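The vector itself was shown as an image that did not survive. As a rough sketch, it can be rebuilt in base R from the matrix defined in the code further down. The r_01-style names follow the zero-based labels used in this question, and `idx` is a name introduced here purely for illustration.

```r
correlation_matrix_input <- matrix(
  c(1.00, 0.561, 0.393, 0.561,
    0.561, 1.00, 0.286, 0.549,
    0.393, 0.286, 1.00, 0.286,
    0.561, 0.549, 0.286, 1.00),
  ncol = 4, byrow = TRUE)

# Row/column positions of the strictly lower triangle, in column-major order
idx <- which(lower.tri(correlation_matrix_input), arr.ind = TRUE)

# Pull out the correlations and label them with zero-based index pairs
vector_of_correlations <- correlation_matrix_input[idx]
names(vector_of_correlations) <- paste0("r_", idx[, "col"] - 1, idx[, "row"] - 1)

print(vector_of_correlations)
```

The two-digit labels are only unambiguous for up to ten variables; a separator such as "r_0_1" would be needed beyond that.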
Now I want to compute the variance-covariance matrix of these correlations, which are approximately normally distributed.
The variances can be approximated by
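(The formula image is missing at this point. Judging from the diagonal case in the code of this post, it is presumably the following large-sample approximation; this is a reconstruction, not the original image.)

```latex
\operatorname{Var}(r_{ij}) \approx \frac{\left(1 - r_{ij}^{2}\right)^{2}}{N}
```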
-> N is the sample size (in this example N = 66)
The covariances can be approximated by
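(This formula image is missing as well. The expression below is reconstructed term by term from the `covr` helper in the answer further down, so it should be read as what that code computes rather than as a verbatim copy of the original formula.)

```latex
\operatorname{Cov}(r_{ij}, r_{kl}) \approx \frac{1}{N}\Bigl[
  \tfrac{1}{2}\, r_{ij} r_{kl}\left(r_{ik}^{2} + r_{il}^{2} + r_{jk}^{2} + r_{jl}^{2}\right)
  + r_{ik} r_{jl} + r_{il} r_{jk}
  - \left(r_{ij} r_{ik} r_{il} + r_{ji} r_{jk} r_{jl} + r_{ki} r_{kj} r_{kl} + r_{li} r_{lj} r_{lk}\right)
\Bigr]
```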
For example the covariance between r_02 and r_13 is given by
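(The worked example was also an image. Substituting i = 0, j = 2, k = 1, l = 3 into the covariance expression that the answer's `covr` code implements would give the following; since the matrix is symmetric, r_21 = r_12 and so on, which is why only six distinct correlations appear.)

```latex
\operatorname{Cov}(r_{02}, r_{13}) \approx \frac{1}{N}\Bigl[
  \tfrac{1}{2}\, r_{02} r_{13}\left(r_{01}^{2} + r_{03}^{2} + r_{21}^{2} + r_{23}^{2}\right)
  + r_{01} r_{23} + r_{03} r_{21}
  - \left(r_{02} r_{01} r_{03} + r_{20} r_{21} r_{23} + r_{10} r_{12} r_{13} + r_{30} r_{32} r_{31}\right)
\Bigr]
```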
Now I want to define a function in R that takes the correlation matrix as input and returns the variance-covariance matrix. However, I am having trouble implementing the calculation of the covariances. My idea is to name the elements of correlation_vector as shown above (r_01, r_02, ...). Then I want to create an empty variance-covariance matrix whose numbers of rows and columns both equal the length of correlation_vector. The rows and columns should have the same names as correlation_vector, so that I can address them by, for example, [01][03]. Then I want to implement a for-loop that sets i and j as well as k and l, as in the covariance formula, to the row and column indices of the correlations I need as input for that formula. These must always be six different values (ij; ik; il; jk; jl; lk). That is my idea, but I don't know how to implement it in R.
This is my code (without the calculation of the covariances):
require(corpcor)
correlation_matrix_input <- matrix(data=c(1.00,0.561,0.393,0.561,0.561,1.00,0.286,0.549,0.393,0.286,1.00,0.286,0.561,0.549,0.286,1.00),ncol=4,byrow=T)
N <- 66 # Sample Size
vector_of_correlations <- sm2vec(correlation_matrix_input, diag=F) # lower triangular matrix of correlation_matrix_input
variance_covariance_matrix <- matrix(nrow = length(vector_of_correlations), ncol = length(vector_of_correlations)) # creates the empty variance-covariance matrix
# function to fill the matrix by calculating the variance and the covariances
variances_covariances <- function(vector_of_correlations_input, sample_size) {
  for (i in (seq(along = vector_of_correlations_input))) {
    for (j in (seq(along = vector_of_correlations_input))) {
      # calculate the variances for the diagonal
      if (i == j) {
        variance_covariance_matrix[i,j] = ((1-vector_of_correlations_input[i]**2)**2)/sample_size
      }
      # calculate the covariances
      if (i != j) {
        variance_covariance_matrix[i,j] = ???
      }
    }
  }
  return(variance_covariance_matrix);
}
Does anyone have an idea how to implement the calculation of the covariances using the formula shown above?
I would be grateful for any help with this problem!
It's easier if you keep r as a matrix and use this helper function to make things clearer:
covr <- function(r, i, j, k, l, n){
  # same pair of variables: variance of r[i,j]
  if(i==k && j==l)
    return((1-r[i,j]^2)^2/n)
  # different pairs: approximate covariance between r[i,j] and r[k,l]
  ( 0.5 * r[i,j]*r[k,l]*(r[i,k]^2 + r[i,l]^2 + r[j,k]^2 + r[j,l]^2) +
    r[i,k]*r[j,l] + r[i,l]*r[j,k] - (r[i,j]*r[i,k]*r[i,l] +
    r[j,i]*r[j,k]*r[j,l] + r[k,i]*r[k,j]*r[k,l] + r[l,i]*r[l,j]*r[l,k]) )/n
}
Now define this second function:
vcovr <- function(r, n){
  p <- combn(nrow(r), 2)
  q <- seq(ncol(p))
  outer(q, q, Vectorize(function(x,y) covr(r, p[1,x], p[2,x], p[1,y], p[2,y], n)))
}
And voila:
> vcovr(correlation_matrix_input, 66)
            [,1]        [,2]        [,3]        [,4]        [,5]        [,6]
[1,] 0.007115262 0.001550264 0.002917481 0.003047666 0.003101602 0.001705781
[2,] 0.001550264 0.010832674 0.001550264 0.006109565 0.001127916 0.006109565
[3,] 0.002917481 0.001550264 0.007115262 0.001705781 0.003101602 0.003047666
[4,] 0.003047666 0.006109565 0.001705781 0.012774221 0.002036422 0.006625868
[5,] 0.003101602 0.001127916 0.003101602 0.002036422 0.007394554 0.002036422
[6,] 0.001705781 0.006109565 0.003047666 0.006625868 0.002036422 0.012774221
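The question also asked for rows and columns that can be addressed by name. One possible way, sketched here under the assumption that the zero-based r_01-style labels from the question are wanted, is to derive the labels from the same `combn` pairs that `vcovr` uses. `vcovr_named` is a hypothetical wrapper that is not part of the original answer; `covr` and `vcovr` are copied from above only so the block runs on its own.

```r
# covr() and vcovr() copied from the answer so this block is self-contained
covr <- function(r, i, j, k, l, n) {
  if (i == k && j == l)
    return((1 - r[i, j]^2)^2 / n)
  (0.5 * r[i, j] * r[k, l] * (r[i, k]^2 + r[i, l]^2 + r[j, k]^2 + r[j, l]^2) +
     r[i, k] * r[j, l] + r[i, l] * r[j, k] -
     (r[i, j] * r[i, k] * r[i, l] + r[j, i] * r[j, k] * r[j, l] +
        r[k, i] * r[k, j] * r[k, l] + r[l, i] * r[l, j] * r[l, k])) / n
}
vcovr <- function(r, n) {
  p <- combn(nrow(r), 2)
  q <- seq(ncol(p))
  outer(q, q, Vectorize(function(x, y) covr(r, p[1, x], p[2, x], p[1, y], p[2, y], n)))
}

# Hypothetical wrapper (an assumption, not from the answer): adds pair labels
vcovr_named <- function(r, n) {
  p <- combn(nrow(r), 2)
  labels <- paste0("r_", p[1, ] - 1, p[2, ] - 1)  # zero-based, as in the question
  v <- vcovr(r, n)
  dimnames(v) <- list(labels, labels)
  v
}

correlation_matrix_input <- matrix(
  c(1.00, 0.561, 0.393, 0.561,
    0.561, 1.00, 0.286, 0.549,
    0.393, 0.286, 1.00, 0.286,
    0.561, 0.549, 0.286, 1.00),
  ncol = 4, byrow = TRUE)

v <- vcovr_named(correlation_matrix_input, 66)
print(v["r_01", "r_02"])  # covariance between r_01 and r_02
print(isSymmetric(v))     # a covariance matrix should be symmetric
```

Because `combn` enumerates the pairs in the same order as the lower triangle read column by column, the labels line up with the correlation vector from the question.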
EDIT:
For the transformed Z values, as in your comment, you can use this:
covrZ <- function(r, i, j, k, l, n){
  if(i==k && j==l)
    return(1/(n-3))
  covr(r, i, j, k, l, n) / ((1-r[i,j]^2)*(1-r[k,l]^2))
}
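Written as formulas, `covrZ` appears to compute the following, assuming that "transformed Z values" means the Fisher transformation z = atanh(r) (the comment being referred to is not reproduced on this page). The division by (1 - r²) factors is what a first-order delta-method argument gives, since dz/dr = 1/(1 - r²).

```latex
\operatorname{Var}(z_{ij}) \approx \frac{1}{n - 3},
\qquad
\operatorname{Cov}(z_{ij}, z_{kl}) \approx
  \frac{\operatorname{Cov}(r_{ij}, r_{kl})}{\left(1 - r_{ij}^{2}\right)\left(1 - r_{kl}^{2}\right)}
```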
And simply use it in place of covr inside vcovr:
vcovrZ <- function(r, n){
  p <- combn(nrow(r), 2)
  q <- seq(ncol(p))
  outer(q, q, Vectorize(function(x,y) covrZ(r, p[1,x], p[2,x], p[1,y], p[2,y], n)))
}
New result:
> vcovrZ(correlation_matrix_input,66)
            [,1]        [,2]        [,3]        [,4]        [,5]        [,6]
[1,] 0.015873016 0.002675460 0.006212598 0.004843517 0.006478743 0.002710920
[2,] 0.002675460 0.015873016 0.002675460 0.007869213 0.001909452 0.007869213
[3,] 0.006212598 0.002675460 0.015873016 0.002710920 0.006478743 0.004843517
[4,] 0.004843517 0.007869213 0.002710920 0.015873016 0.003174685 0.007858948
[5,] 0.006478743 0.001909452 0.006478743 0.003174685 0.015873016 0.003174685
[6,] 0.002710920 0.007869213 0.004843517 0.007858948 0.003174685 0.015873016
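Since vcovr and vcovrZ differ only in which element function they call, one option is to pass that function in as an argument. The sketch below is not part of the original answer: `pairwise_vcov` and `variance_only` are hypothetical names, and `variance_only` is just a stand-in that fills the diagonal with the variance approximation so the block can run without the other definitions.

```r
# Hypothetical generalisation: build the pair-by-pair matrix from any
# element function with the signature f(r, i, j, k, l, n)
pairwise_vcov <- function(r, n, element_fun) {
  p <- combn(nrow(r), 2)   # all variable pairs (i < j), one per column
  q <- seq(ncol(p))
  outer(q, q, Vectorize(function(x, y)
    element_fun(r, p[1, x], p[2, x], p[1, y], p[2, y], n)))
}

# Stand-in element function for demonstration only:
# variance on the diagonal, zero everywhere else
variance_only <- function(r, i, j, k, l, n) {
  if (i == k && j == l) (1 - r[i, j]^2)^2 / n else 0
}

correlation_matrix_input <- matrix(
  c(1.00, 0.561, 0.393, 0.561,
    0.561, 1.00, 0.286, 0.549,
    0.393, 0.286, 1.00, 0.286,
    0.561, 0.549, 0.286, 1.00),
  ncol = 4, byrow = TRUE)

d <- pairwise_vcov(correlation_matrix_input, 66, variance_only)
print(dim(d))
print(diag(d))
```

With the answer's functions in scope, `pairwise_vcov(correlation_matrix_input, 66, covr)` and `pairwise_vcov(correlation_matrix_input, 66, covrZ)` should then play the roles of vcovr and vcovrZ.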