Compare matrices to find the differences


Problem description

I have two matrices and want to compare them row by row, matched on row names, to find the differences.

> head(N1)
              Total_Degree Transitivity Betweenness Closeness_All
2410016O06RIK            1          NaN     0.00000  0.0003124024
AGO1                     4    0.1666667    37.00000  0.0003133814
APEX1                    4    0.6666667     4.00000  0.0003144654
ATR                      4    0.1666667    19.50000  0.0003128911
CASP3                   24    0.0000000   806.00000  0.0002980626
CCND2                    4    0.3333333    97.33333  0.0003132832

> head(N2)
              Total_Degree Transitivity Betweenness Closeness_All
2410016O06RIK            1          NaN         0.0  2.279982e-04
ADI1                     1          NaN         0.0  1.728877e-05
AGO1                     3    0.0000000        40.0  2.284670e-04
AIRN                     1          NaN         0.0  1.721733e-05
APEX1                    3    0.6666667         2.0  2.288330e-04
ATR                      3    0.3333333        19.5  2.281542e-04

Many of the row names in N1 also exist in N2. I want to compare those rows and write the differences to a new matrix. Rows that occur in only one matrix should be labelled as belonging to N1 or N2.

I am not sure what the best criterion for the difference is. What I can think of is to add up all the values of a row in N1 and subtract the sum of the corresponding row in N2.

For example, the output should look like this:

> head(Compared)
                       Comparison Unique 
    2410016O06RIK        0.0002     Common
    AGO1                 -1.83      Common
    APEX1                 2.24      Common
    ATR                  0.0034     Common
    CASP3               830.00029   N1
    ADI1                1.0007288   N2

Here, for row name 2410016O06RIK, all values in the N1 row and all values in the N2 row were summed, and N1 - N2 was written to the Comparison column. Because this row occurs in both matrices, the Unique column says Common.

Solution

A way to go in base R, with rowSums and merge:

If N1 and N2 are data.frames:

# compute the row sums and merge N1 and N2
N1$rs <- rowSums(N1, na.rm=TRUE)
N2$rs <- rowSums(N2, na.rm=TRUE)
comp <- merge(N1[, "rs", drop=FALSE], N2[, "rs", drop=FALSE], by="row.names", all=TRUE)

# then record where each row occurs (index 1 = N1 only, 2 = N2 only, 3 = both)
comp$Unique <- with(comp, c("N1", "N2", "common")[(!is.na(rs.x)) + 2*(!is.na(rs.y))])
# and compare the row sums (NA when a row is missing from one side)
comp$Comparison <- with(comp, rs.x-rs.y)

# keep only the variables you need (columns 1, 5, 4 by position):
comp <- comp[, c("Row.names", "Comparison", "Unique")]
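To try the data.frame variant end to end, here is a minimal sketch that rebuilds the six rows shown by head(N1) and head(N2) in the question as data.frames (the real objects are presumably longer) and applies the same steps. The intermediate name idx is introduced here only to spell out the indexing trick: a row present only in N1 gives index 1, only in N2 gives 2, and in both gives 1 + 2 = 3.

```r
# Sample data copied from head(N1) and head(N2) in the question
N1 <- data.frame(
  Total_Degree  = c(1, 4, 4, 4, 24, 4),
  Transitivity  = c(NaN, 0.1666667, 0.6666667, 0.1666667, 0, 0.3333333),
  Betweenness   = c(0, 37, 4, 19.5, 806, 97.33333),
  Closeness_All = c(0.0003124024, 0.0003133814, 0.0003144654,
                    0.0003128911, 0.0002980626, 0.0003132832),
  row.names = c("2410016O06RIK", "AGO1", "APEX1", "ATR", "CASP3", "CCND2")
)
N2 <- data.frame(
  Total_Degree  = c(1, 1, 3, 1, 3, 3),
  Transitivity  = c(NaN, NaN, 0, NaN, 0.6666667, 0.3333333),
  Betweenness   = c(0, 0, 40, 0, 2, 19.5),
  Closeness_All = c(2.279982e-04, 1.728877e-05, 2.284670e-04,
                    1.721733e-05, 2.288330e-04, 2.281542e-04),
  row.names = c("2410016O06RIK", "ADI1", "AGO1", "AIRN", "APEX1", "ATR")
)

# Row sums (NaN dropped), then a full outer join on the row names
N1$rs <- rowSums(N1, na.rm = TRUE)
N2$rs <- rowSums(N2, na.rm = TRUE)
comp <- merge(N1[, "rs", drop = FALSE], N2[, "rs", drop = FALSE],
              by = "row.names", all = TRUE)

# in N1 only -> 1, in N2 only -> 2, in both -> 1 + 2 = 3
idx <- (!is.na(comp$rs.x)) + 2 * (!is.na(comp$rs.y))
comp$Unique <- c("N1", "N2", "common")[idx]
comp$Comparison <- comp$rs.x - comp$rs.y
comp <- comp[, c("Row.names", "Comparison", "Unique")]
print(comp)
```

Selecting the final columns by name rather than by position keeps the last step working even if extra columns are added along the way.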

If N1 and N2 are matrices:

# compute the row sums and merge N1 and N2
rs1 <- rowSums(N1, na.rm=TRUE)
rs2 <- rowSums(N2, na.rm=TRUE)
comp <- merge(N1, N2, by="row.names", all=TRUE)

# then record where each row occurs (index 1 = N1 only, 2 = N2 only, 3 = both)
comp$Unique <- with(comp, c("N1", "N2", "common")[as.numeric(!is.na(Total_Degree.x)) + 2*as.numeric(!is.na(Total_Degree.y))])
# and compare the row sums; both merges sort by row name, so the rows line up
comp$Comparison <- with(merge(as.data.frame(rs1), as.data.frame(rs2), all=TRUE, by="row.names"), rs1-rs2)

# keep only the variables you need:
comp <- comp[, c("Row.names", "Comparison", "Unique")]
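For matrices, the same comparison can also be done without merge by indexing the named row-sum vectors directly. The following is only a sketch of that alternative, not part of the original answer: compare_rows is a hypothetical helper name, the two small matrices A and B are made up for illustration, and it assumes both inputs are numeric matrices with unique row names.

```r
# Hypothetical helper (not from the original answer): compare two numeric
# matrices by row name using named row-sum vectors instead of merge()
compare_rows <- function(m1, m2) {
  rs1 <- rowSums(m1, na.rm = TRUE)   # named by rownames(m1)
  rs2 <- rowSums(m2, na.rm = TRUE)   # named by rownames(m2)
  ids <- sort(union(names(rs1), names(rs2)))
  in1 <- ids %in% names(rs1)
  in2 <- ids %in% names(rs2)
  data.frame(
    Row.names  = ids,
    # indexing a named vector by a name it lacks yields NA
    Comparison = unname(rs1[ids] - rs2[ids]),
    Unique     = ifelse(in1 & in2, "common", ifelse(in1, "N1", "N2")),
    stringsAsFactors = FALSE
  )
}

# Tiny made-up matrices, only to show the call
A <- matrix(c(1, 2, 3, 4), nrow = 2,
            dimnames = list(c("g1", "g2"), c("x", "y")))
B <- matrix(c(1, 1, 5, 5), nrow = 2,
            dimnames = list(c("g2", "g3"), c("x", "y")))
res <- compare_rows(A, B)
print(res)
```

Because membership is tested on the row names themselves, a genuine NA in Total_Degree cannot be mistaken for a missing row, which is a possible weakness of testing Total_Degree.x and Total_Degree.y after the merge.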

Output of both methods:

comp
#      Row.names    Comparison Unique
#1 2410016O06RIK  0.0000844042 common
#2          ADI1            NA     N2
#3          AGO1 -1.8332483856 common
#4          AIRN            NA     N2
#5         APEX1  3.0000856324 common
#6           ATR  0.8334181369 common
#7         CASP3            NA     N1
#8         CCND2            NA     N1
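In this output, Comparison is NA for rows that occur in only one input, whereas the expected output in the question appears to show the row's own sum there (for example CASP3 with about 830). If that is what is wanted, one possible adjustment, sketched here on made-up numbers, is to fall back to the available row sum before the rs.x and rs.y columns are dropped. The data frame below is hypothetical and only mimics the shape that merge(..., all=TRUE) produces in the data.frame variant.

```r
# Made-up merged row sums in the shape produced by merge(..., all = TRUE):
# rs.x comes from N1, rs.y from N2, NA marks a row missing on that side
comp <- data.frame(
  Row.names = c("a", "b", "c"),
  rs.x = c(10, NA, 7),
  rs.y = c(8, 3, NA)
)

# Difference for common rows, otherwise the one row sum that exists
comp$Comparison <- with(comp, ifelse(is.na(rs.x), rs.y,
                              ifelse(is.na(rs.y), rs.x, rs.x - rs.y)))
print(comp)
```

Whether a row that exists only in N2 should be reported with a positive or a negative sign is a judgment call. The sketch keeps it positive, following the ADI1 line in the question's example.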
