R - A loop comparing elements in common between two hierarchical lists
Question
For some time I have been trying to build a matrix populated by the counts of elements in common between two hierarchical lists.
Here is some dummy data:
site<-c('A','A','A','A','A','A','A','A','A','B','B','B','B','B','B')
group<-c('A1','A1','A2','A2','A2','A3','A3','A3','A3',
'B1','B1','B2','B2','B2','B2')
element<-c("red","orange","blue","black","white", "black","cream","yellow","purple","red","orange","blue","white","gray","salmon")
d <- data.frame(site, group, element)
I created a list structure, assuming that was the way to proceed given the different number of elements in each list. I also don't want every possible comparison between groups, only comparisons between sites.
# first-level list: split by site
sitelist <- split(d, d$site, drop = TRUE)
# second-level list: split each site by group
nestedlist <- lapply(sitelist, function(x) split(x, x[['group']], drop = TRUE))
My intention is to create a table or matrix with the number of elements in common between groups from the two sites (my original data has additional sites), like this:
   A1 A2 A3
B1  2  0  0
B2  0  2  0
The nested nature of this problem is challenging for me. I am not very familiar with lists, as I have mostly solved problems using data frames. My attempt boiled down to the following. I feel it is close, but I struggle with the correct syntax for loops.
t <- outer(1:length(nestedlist$A),
           1:length(nestedlist$B),
           FUN = function(i, j) {
             sapply(1:length(i),
                    FUN = function(x)
                      length(intersect(nestedlist$A[[i]]$element, nestedlist$B[[j]]$element)))
           })
Any help would be much appreciated. Apologies if a similar problem has already been solved. I have scoured the internet but have not found it, or did not understand the solution well enough to transfer it to my case.
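For reference, the attempt above fails because outer() calls FUN once with the full vectors of i and j rather than once per pair, so the list is indexed with a whole vector of indices. A minimal sketch of a list-based version, assuming the nested list is built from the dummy data as described above (the helper name n_common is hypothetical), wraps the pairwise count in Vectorize():

```r
# Dummy data from the question, as a data frame so that `$` works
site <- c(rep("A", 9), rep("B", 6))
group <- c("A1", "A1", "A2", "A2", "A2", "A3", "A3", "A3", "A3",
           "B1", "B1", "B2", "B2", "B2", "B2")
element <- c("red", "orange", "blue", "black", "white",
             "black", "cream", "yellow", "purple",
             "red", "orange", "blue", "white", "gray", "salmon")
d <- data.frame(site, group, element, stringsAsFactors = FALSE)

# Nested list: site -> group -> rows of d
nestedlist <- lapply(split(d, d$site, drop = TRUE),
                     function(x) split(x, x$group, drop = TRUE))

# Hypothetical helper: shared elements for one (B group, A group) index pair
n_common <- function(i, j) {
  length(intersect(nestedlist$B[[i]]$element, nestedlist$A[[j]]$element))
}

# Vectorize() lets n_common accept the index vectors that outer() supplies
common <- outer(seq_along(nestedlist$B), seq_along(nestedlist$A),
                Vectorize(n_common))
dimnames(common) <- list(names(nestedlist$B), names(nestedlist$A))
common
```

With the dummy data this gives a 2 x 3 matrix with B groups as rows and A groups as columns, matching the layout of the desired table.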
Recommended answer
A similar approach to @Parfait's, using matrix multiplication. You may need to adapt the data-preparation step to extend it to your application:
site<-c('A','A','A','A','A','A','A','A','A','B','B','B','B','B','B')
group<-c('A1','A1','A2','A2','A2','A3','A3','A3','A3',
'B1','B1','B2','B2','B2','B2')
element<-c("red","orange","blue","black","white", "black","cream","yellow","purple","red","orange","blue","white","gray","salmon")
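# el is a factor, so both tables below share the same set of element columns
d <- data.frame(group, el = as.factor(element), stringsAsFactors = FALSE)
As <- d[group %in% paste0("A", 1:3), ]
Bs <- d[group %in% paste0("B", 1:2), ]
# group-by-element tables: how often each element occurs in each group
A_mat <- as.matrix(table(As))
B_mat <- as.matrix(table(Bs))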
Result:
> A_mat
     el
group black blue cream gray orange purple red salmon white yellow
   A1     0    0     0    0      1      0   1      0     0      0
   A2     1    1     0    0      0      0   0      0     1      0
   A3     1    0     1    0      0      1   0      0     0      1
> B_mat
     el
group black blue cream gray orange purple red salmon white yellow
   B1     0    0     0    0      1      0   1      0     0      0
   B2     0    1     0    1      0      0   0      1     1      0
> B_mat %*% t(A_mat)
     group
group A1 A2 A3
   B1  2  0  0
   B2  0  2  0
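The product works because each row is a 0/1 indicator vector over the elements, so the dot product of a B row with an A row counts the elements present in both groups. That only holds when an element appears at most once per group. To extend this to more sites without hard-coding the group names, one possible sketch, assuming the site column from the question's dummy data (inc and common_between are hypothetical names), builds a single incidence matrix and subsets it per site:

```r
# Dummy data from the question, including the site column
site <- c(rep("A", 9), rep("B", 6))
group <- c("A1", "A1", "A2", "A2", "A2", "A3", "A3", "A3", "A3",
           "B1", "B1", "B2", "B2", "B2", "B2")
element <- c("red", "orange", "blue", "black", "white",
             "black", "cream", "yellow", "purple",
             "red", "orange", "blue", "white", "gray", "salmon")
d <- data.frame(site, group, element, stringsAsFactors = FALSE)

# Group-by-element incidence matrix for all groups at once.
# unique() keeps every entry at 0 or 1, so the product below counts
# distinct shared elements even if the raw data repeats an element.
inc <- unclass(table(unique(d[, c("group", "element")])))

# Hypothetical helper: shared-element counts between the groups of two sites
common_between <- function(site_rows, site_cols) {
  rows <- unique(d$group[d$site == site_rows])
  cols <- unique(d$group[d$site == site_cols])
  # tcrossprod(x, y) computes x %*% t(y)
  tcrossprod(inc[rows, , drop = FALSE], inc[cols, , drop = FALSE])
}

res <- common_between("B", "A")
res
```

For the dummy data, common_between("B", "A") reproduces the 2 x 3 table above. With additional sites, the same call can be repeated for each pair of site labels.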