R: Filter correlation matrix on values > and <
Problem description
I am programming in R and have a huge correlation matrix. I would like to filter this matrix so that I keep only the rows and columns containing values > 0.7 or < -0.7. I have already tried subset and filter but don't really get what I want. An additional problem is that there are so many row/column names that I do not want to work with them by name. Can anybody please help?
For example:
     1    2    3    4
1    1    0  0.7  0.6
2    0    1  0.6  0.6
3  0.1    0    1  0.8
4 -0.2    0  0.7  0.9
should return:
     1    3    4
1    1  0.7  0.6
3  0.1    1  0.8
4 -0.2  0.7  0.9
Recommended answer
Zero out the diagonal and use apply(..., 1, any) to find the rows (and therefore also the columns, owing to symmetry) that have an off-diagonal entry whose absolute value is >= threshold.
For testing, if cc is the matrix in the question, then we have used cor(cc) and threshold = 0.6 instead, because cc in the question is not a correlation matrix.
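# Matrix from the question, filled column by column
cc <- matrix(c(1, 0, 0.1, -0.2, 0, 1, 0, 0, 0.7, 0.6, 1, 0.7, 0.6, 0.6, 0.8, 0.9), 4)
cc <- cor(cc)  # turn it into a genuine correlation matrix for testing
threshold <- 0.6
cc0 <- cc
diag(cc0) <- 0  # ignore the diagonal, which is always 1
ok <- apply(abs(cc0) >= threshold, 1, any)  # rows with at least one large off-diagonal entry
cc[ok, ok]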
giving:
           [,1]       [,2]
[1,]  1.0000000 -0.6375997
[2,] -0.6375997  1.0000000
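As a cross-check, the same steps can be applied directly to the matrix from the question with the asker's threshold of 0.7. This is only a sketch: the names m, m0 and keep are illustrative, the labels "1" to "4" are taken from the example, and because that matrix is not symmetric the result depends on testing rows (margin 1), as the answer does.

```r
# Matrix from the question, filled column by column, with labels "1".."4"
m <- matrix(c(1, 0, 0.1, -0.2,
              0, 1, 0, 0,
              0.7, 0.6, 1, 0.7,
              0.6, 0.6, 0.8, 0.9), 4,
            dimnames = list(as.character(1:4), as.character(1:4)))

threshold <- 0.7
m0 <- m
diag(m0) <- 0                                   # the diagonal should not count
keep <- apply(abs(m0) >= threshold, 1, any)     # test each row
res <- m[keep, keep]
res                                             # keeps rows/columns 1, 3 and 4
```

Row 2 has no off-diagonal entry of magnitude 0.7 or more, so it is dropped together with column 2, which matches the result the question asks for.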
The last two lines of code could alternatively be replaced with the following, which uses which(..., arr.ind = TRUE) to get the coordinates of the entries whose absolute value is >= threshold. (The abbreviated form arr = TRUE also works, through R's partial argument matching.)
ix <- sort(unique(c(which(abs(cc0) >= threshold, arr.ind = TRUE))))
cc[ix, ix]
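For a large matrix with many names, the steps above could be wrapped in a small function so that no row or column ever has to be referred to by name. The sketch below is one possible packaging, not part of the original answer: the function name filter_cor, its arguments, and the simulated data are all hypothetical, and rowSums(...) > 0 is swapped in for apply(..., 1, any) as an equivalent test.

```r
# Hypothetical helper: keep rows/columns with at least one off-diagonal
# entry whose absolute value is >= threshold.
filter_cor <- function(cm, threshold = 0.7) {
  stopifnot(is.matrix(cm), nrow(cm) == ncol(cm))
  cm0 <- cm
  diag(cm0) <- 0                                   # ignore self-correlations
  keep <- rowSums(abs(cm0) >= threshold, na.rm = TRUE) > 0
  cm[keep, keep, drop = FALSE]                     # always return a matrix
}

# Simulated data (an assumption for illustration): e is nearly a copy of a
set.seed(1)
x <- matrix(rnorm(200), ncol = 4, dimnames = list(NULL, c("a", "b", "c", "d")))
x <- cbind(x, e = x[, "a"] + rnorm(50, sd = 0.1))
filter_cor(cor(x), 0.7)
```

Here drop = FALSE keeps the result a matrix even if only one row were selected, and na.rm = TRUE treats missing correlations as not exceeding the threshold; both are design choices of this sketch rather than requirements of the method.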