Compare matrices to find the differences
Problem description
I have two matrices and want to compare them by row name to find the differences.
> head(N1)
              Total_Degree Transitivity Betweenness Closeness_All
2410016O06RIK            1          NaN     0.00000  0.0003124024
AGO1                     4    0.1666667    37.00000  0.0003133814
APEX1                    4    0.6666667     4.00000  0.0003144654
ATR                      4    0.1666667    19.50000  0.0003128911
CASP3                   24    0.0000000   806.00000  0.0002980626
CCND2                    4    0.3333333    97.33333  0.0003132832
> head(N2)
              Total_Degree Transitivity Betweenness Closeness_All
2410016O06RIK            1          NaN         0.0  2.279982e-04
ADI1                     1          NaN         0.0  1.728877e-05
AGO1                     3    0.0000000        40.0  2.284670e-04
AIRN                     1          NaN         0.0  1.721733e-05
APEX1                    3    0.6666667         2.0  2.288330e-04
ATR                      3    0.3333333        19.5  2.281542e-04
Many of the row names in N1 also exist in N2. I want to compare those rows and write the differences to a new matrix. Rows that are unique to N1 or N2 should be flagged as belonging to N1 or N2.
I am not sure which criterion is best for calculating the difference. What I can think of is to sum all values of a row in N1 and take the difference from the sum of the corresponding row in N2 (N1 - N2).
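To make the proposed criterion concrete, here is a minimal sketch for a single shared row, using the values printed above. The names `n1_row`, `n2_row` and `difference` are hypothetical and introduced only for illustration; dropping `NaN` with `na.rm = TRUE` is an assumption about how a missing transitivity should be handled.

```r
# Values of row 2410016O06RIK, copied from head(N1) and head(N2) above.
n1_row <- c(Total_Degree = 1, Transitivity = NaN,
            Betweenness = 0, Closeness_All = 0.0003124024)
n2_row <- c(Total_Degree = 1, Transitivity = NaN,
            Betweenness = 0, Closeness_All = 2.279982e-04)

# Sum each row while ignoring NaN (assumption), then take N1 - N2.
difference <- sum(n1_row, na.rm = TRUE) - sum(n2_row, na.rm = TRUE)
print(difference)
```

For this row the difference comes almost entirely from `Closeness_All`, since the other columns are identical or missing.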
For example, the output should be:
> head(Compared)
              Comparison Unique
2410016O06RIK     0.0002 Common
AGO1               -1.83 Common
APEX1               2.24 Common
ATR               0.0034 Common
CASP3          830.00029 N1
ADI1           1.0007288 N2
Here, for row name 2410016O06RIK, all values from N1 and from N2 were summed and N1 - N2 was written in the Comparison column. Because this row is present in both matrices, Common was written in the Unique column.
One way to do this in base R, with rowSums and merge:
If N1 and N2 are data.frames:
# compute the row sums, then merge N1 and N2 on their row names (outer join)
N1$rs <- rowSums(N1, na.rm = TRUE)
N2$rs <- rowSums(N2, na.rm = TRUE)
comp <- merge(N1[, "rs", drop = FALSE], N2[, "rs", drop = FALSE], by = "row.names", all = TRUE)
# label where each row occurs (N1 only, N2 only, or both) and compare the row sums
comp$Unique <- with(comp, c("N1", "N2", "common")[(!is.na(rs.x)) + 2 * (!is.na(rs.y))])
comp$Comparison <- with(comp, rs.x - rs.y)
# keep only the variables you need:
comp <- comp[, c("Row.names", "Comparison", "Unique")]
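The indexing expression that fills `comp$Unique` relies on logical values being coerced to 0 and 1 in arithmetic. A minimal sketch of that step in isolation, with hypothetical vectors `x` and `y` standing in for `rs.x` and `rs.y` (the numbers are made up for illustration):

```r
# Hypothetical stand-ins for comp$rs.x and comp$rs.y after the merge.
x <- c(1.0, NA, 41.2, 830.0)
y <- c(1.0, 1.0, 43.0, NA)

# TRUE counts as 1 and FALSE as 0, so the index is
# 1 (only x present), 2 (only y present) or 3 (both present).
idx <- (!is.na(x)) + 2 * (!is.na(y))
labels <- c("N1", "N2", "common")[idx]
print(labels)
# "common" "N2" "common" "N1"
```

An index of 0 would only arise if both sums were `NA`, which should not happen here because `rowSums(..., na.rm = TRUE)` does not return `NA` and every merged row comes from at least one input.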
If N1 and N2 are matrices:
# compute the row sums, then merge N1 and N2 on their row names (outer join)
rs1 <- rowSums(N1, na.rm = TRUE)
rs2 <- rowSums(N2, na.rm = TRUE)
comp <- merge(N1, N2, by = "row.names", all = TRUE)
# label where each row occurs (N1 only, N2 only, or both) and compare the row sums
comp$Unique <- with(comp, c("N1", "N2", "common")[as.numeric(!is.na(Total_Degree.x)) + 2 * as.numeric(!is.na(Total_Degree.y))])
comp$Comparison <- with(merge(as.data.frame(rs1), as.data.frame(rs2), all = TRUE, by = "row.names"), rs1 - rs2)
# keep only the variables you need:
comp <- comp[, c("Row.names", "Comparison", "Unique")]
Output of both methods:
comp
#       Row.names    Comparison Unique
#1  2410016O06RIK  0.0000844042 common
#2           ADI1            NA     N2
#3           AGO1 -1.8332483856 common
#4           AIRN            NA     N2
#5          APEX1  3.0000856324 common
#6            ATR  0.8334181369 common
#7          CASP3            NA     N1
#8          CCND2            NA     N1
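The example output in the question shows a number for rows that exist in only one matrix (for instance CASP3), whereas the result above gives NA there. If that is what is wanted, one possible variant reports the row's own sum for unique rows. This is a sketch based on one reading of the question, not part of the answer above; the two small matrices below are rebuilt from a few of the rows shown in the question so that the block runs on its own.

```r
# Rebuild a few of the rows shown in the question as matrices.
cols <- c("Total_Degree", "Transitivity", "Betweenness", "Closeness_All")
N1 <- matrix(c(1, NaN, 0, 0.0003124024,
               4, 0.1666667, 37, 0.0003133814,
               24, 0, 806, 0.0002980626),
             ncol = 4, byrow = TRUE,
             dimnames = list(c("2410016O06RIK", "AGO1", "CASP3"), cols))
N2 <- matrix(c(1, NaN, 0, 2.279982e-04,
               1, NaN, 0, 1.728877e-05,
               3, 0, 40, 2.284670e-04),
             ncol = 4, byrow = TRUE,
             dimnames = list(c("2410016O06RIK", "ADI1", "AGO1"), cols))

# Row sums as one-column data frames, merged on row names (outer join).
rs1 <- data.frame(rs1 = rowSums(N1, na.rm = TRUE))
rs2 <- data.frame(rs2 = rowSums(N2, na.rm = TRUE))
comp <- merge(rs1, rs2, by = "row.names", all = TRUE)

# Label each row as N1 only, N2 only, or common.
comp$Unique <- with(comp, c("N1", "N2", "common")[(!is.na(rs1)) + 2 * (!is.na(rs2))])

# Assumption: a row that is missing from one matrix reports its own row sum
# instead of NA; rows present in both report N1 - N2 as before.
comp$Comparison <- with(comp, ifelse(is.na(rs1), rs2,
                              ifelse(is.na(rs2), rs1, rs1 - rs2)))

comp <- comp[, c("Row.names", "Comparison", "Unique")]
print(comp)
```

With this variant the Comparison column mixes two kinds of values (a difference for common rows, a plain sum for unique rows), so the Unique column is needed to interpret each number.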