R: Filter correlation matrix on values > and <


Question

I am programming in R and have a huge correlation matrix. I would like to filter this matrix so that I keep only the rows and columns containing values > 0.7 or < -0.7. I have already tried subset and filter but did not get what I want. An additional problem is that there are so many row/column names that I do not want to work with them by name. Can anybody please help?

For example:

  1    2  3   4  
1 1    0  0.7 0.6  
2 0    1  0.6 0.6  
3 0.1  0  1   0.8  
4 -0.2 0  0.7 0.9  

should return:

  1    3    4   
1 1    0.7  0.6  
3 0.1  1    0.8  
4 -0.2 0.7  0.9


Recommended answer

Zero out the diagonal and use apply(..., 1, any) to find the rows (and therefore also the columns, owing to symmetry) that contain at least one value whose absolute value is >= threshold.

For testing, if cc is the matrix in the question, we use cor(cc) and threshold = 0.6 instead, because cc in the question is not a correlation matrix.

cc <- matrix(c(1, 0, 0.1, -0.2, 0, 1, 0, 0, 0.7, 0.6, 1, 0.7, 0.6, 0.6, 0.8, 0.9), 4)
cc <- cor(cc)

threshold <- 0.6
cc0 <- cc
diag(cc0) <- 0
ok <- apply(abs(cc0) >= threshold, 1, any)
cc[ok, ok]

giving:

           [,1]       [,2]
[1,]  1.0000000 -0.6375997
[2,] -0.6375997  1.0000000
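
The same steps can be wrapped in a small helper. The following is a minimal sketch, assuming a hypothetical function name `filter_cor` (not part of the original answer). It is applied to the literal matrix from the question, with the labels 1 to 4 set as dimnames so the surviving labels stay visible. Because that matrix is not symmetric, only the row-wise test decides what is kept, and `>=` is used because the expected output in the question keeps the rows whose largest off-diagonal entry is exactly 0.7.

```r
# Hypothetical helper (name and signature are assumptions, not from the answer):
# keep only the rows/columns that have at least one off-diagonal entry
# whose absolute value is >= threshold.
filter_cor <- function(m, threshold) {
  m0 <- m
  diag(m0) <- 0                               # ignore the diagonal
  ok <- apply(abs(m0) >= threshold, 1, any)   # row-wise test
  m[ok, ok, drop = FALSE]                     # drop = FALSE always returns a matrix
}

# The matrix from the question, with its labels as dimnames
q <- matrix(c( 1,   0, 0.7, 0.6,
               0,   1, 0.6, 0.6,
               0.1, 0, 1,   0.8,
              -0.2, 0, 0.7, 0.9),
            nrow = 4, byrow = TRUE,
            dimnames = list(as.character(1:4), as.character(1:4)))

res <- filter_cor(q, 0.7)
res   # rows and columns labelled 1, 3 and 4 remain
```

No row or column name has to be typed by hand: the logical vector selects by position, and the dimnames are carried along by the subsetting.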

The last two lines of code could alternatively be replaced with the following, which uses which(..., arr.ind = TRUE) to get the coordinates of the entries whose absolute value is >= threshold:

ix <- sort(unique(c(which(abs(cc0) >= threshold, arr.ind = TRUE))))
cc[ix, ix]
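
As a quick sanity check, the two approaches can be compared on the test data. This is a sketch that rebuilds the objects from the answer so it runs on its own. For a symmetric matrix both should select the same indices. For a non-symmetric matrix they can differ, because the which() variant collects both row and column coordinates, while the apply() variant tests rows only.

```r
# Rebuild the test data used in the answer
cc <- matrix(c(1, 0, 0.1, -0.2, 0, 1, 0, 0,
               0.7, 0.6, 1, 0.7, 0.6, 0.6, 0.8, 0.9), 4)
cc <- cor(cc)

threshold <- 0.6
cc0 <- cc
diag(cc0) <- 0

# Approach 1: logical vector, one element per row
ok <- apply(abs(cc0) >= threshold, 1, any)

# Approach 2: integer indices taken from the (row, col) coordinates
ix <- sort(unique(c(which(abs(cc0) >= threshold, arr.ind = TRUE))))

# Both select the same rows/columns on this symmetric matrix
setequal(which(ok), ix)
isTRUE(all.equal(cc[ok, ok], cc[ix, ix]))
```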
