R Pairwise comparison of matrix columns ignoring empty values


Question

I have an array for which I would like to obtain a measure of the similarity between the values in each pair of columns. That is, I want to compare pairs of columns row by row and increment a counter whenever their values match. The resulting measure would then be at its maximum for two identical columns.

Essentially, my problem is the same as the one discussed here: "R: Compare all the columns pairwise in matrix", except that I do not want empty cells to be counted.

With example data created from code derived from the linked page:

data1 <- c("", "B", "", "", "")
data2 <- c("A", "", "", "", "")
data3 <- c("", "", "C", "", "A")
data4 <- c("", "", "", "", "")
data5 <- c("", "", "C", "", "A")
data6 <- c("", "B", "C", "", "")

my.matrix <- cbind(data1, data2, data3, data4, data5, data6)

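# Column j of the result holds the match counts of column j against every column.
similarity.matrix <- matrix(nrow=ncol(my.matrix), ncol=ncol(my.matrix))
for(col in 1:ncol(my.matrix)){
  matches <- my.matrix[,col] == my.matrix   # compare column `col` with all columns, row by row
  match.counts <- colSums(matches)          # matching rows per column (empty cells included)
  match.counts[col] <- 0                    # do not count a column against itself
  similarity.matrix[,col] <- match.counts
}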

I obtain:

similarity.matrix =

    V1  V2  V3  V4  V5  V6
1   0   3   2   4   2   4
2   3   0   2   4   2   2
3   2   2   0   3   5   3
4   4   4   3   0   3   3
5   2   2   5   3   0   3
6   4   2   3   3   3   0

This counts pairs of empty values as matches.
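To see where the extra counts come from, it may help to compare just the first two columns of the example. The short sketch below only uses `data1` and `data2` from the question; the names `n.all` and `n.nonempty` are illustrative and not part of the original code.

```r
data1 <- c("", "B", "", "", "")
data2 <- c("A", "", "", "", "")

# "" == "" is TRUE, so rows 3 to 5 register as matches even though
# neither column holds a value there.
data1 == data2

# Number of matching rows, empty pairs included.
n.all <- sum(data1 == data2)

# Number of matching rows where the value is not empty.
n.nonempty <- sum(data1 == data2 & data1 != "")

print(c(all = n.all, nonempty = n.nonempty))
```

The first count corresponds to the 3 shown for columns 1 and 2 in the matrix above; the second is the 0 that should appear in the desired output.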

My desired output would be:

expected.output =

    V1  V2  V3  V4  V5  V6
1   0   0   0   0   0   1
2   0   0   0   0   0   0
3   0   0   0   0   2   1
4   0   0   0   0   0   0
5   0   0   2   0   0   1
6   1   0   1   0   1   0

Thanks

Matt

Recommended answer

The following is the answer from akrun:

First, change the blank cells to NA:

is.na(my.matrix) <- my.matrix==''

Then drop the NAs when computing match.counts by passing na.rm=TRUE to colSums:

similarity.matrix <- matrix(nrow=ncol(my.matrix), ncol=ncol(my.matrix))

for(col in 1:ncol(my.matrix)){
  matches <- my.matrix[,col] == my.matrix        # any comparison involving NA yields NA
  match.counts <- colSums(matches, na.rm=TRUE)   # NA comparisons are dropped from the count
  match.counts[col] <- 0                         # do not count a column against itself
  similarity.matrix[,col] <- match.counts
}

This did indeed give me my desired output:

    V1  V2  V3  V4  V5  V6
1   0   0   0   0   0   1
2   0   0   0   0   0   0
3   0   0   0   0   2   1
4   0   0   0   0   0   0
5   0   0   2   0   0   1
6   1   0   1   0   1   0
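The two steps above can also be wrapped into a single function. The following is only a sketch under stated assumptions: `pairwise_matches` is a hypothetical name that does not appear in the original answer, the input is assumed to be a character matrix with at least one column, and the loop is replaced by `vapply`, which is a stylistic choice rather than the answer's method.

```r
# Hypothetical helper (not from the original answer): for every pair of
# columns, count the rows in which both hold the same non-empty value.
pairwise_matches <- function(m) {
  n <- ncol(m)
  m[which(m == "")] <- NA   # treat empty cells as missing
  sim <- vapply(seq_len(n), function(j) {
    # m == m[, j] compares every column with column j, row by row;
    # comparisons involving NA give NA and are dropped by na.rm = TRUE.
    colSums(m == m[, j], na.rm = TRUE)
  }, numeric(n))
  dim(sim) <- c(n, n)       # force an n x n matrix and drop names
  diag(sim) <- 0            # a column is not scored against itself
  sim
}

my.matrix <- cbind(
  data1 = c("", "B", "", "", ""),
  data2 = c("A", "", "", "", ""),
  data3 = c("", "", "C", "", "A"),
  data4 = c("", "", "", "", ""),
  data5 = c("", "", "C", "", "A"),
  data6 = c("", "B", "C", "", "")
)

result <- pairwise_matches(my.matrix)
print(result)
```

Because the function works on its own copy of the matrix, the caller's `my.matrix` keeps its empty strings, which may be preferable when the blanks are needed elsewhere.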

Thanks.
