确定链接在一起的链接情节的组 [英] identify groups of linked episodes which chain together
问题描述
采用链接ID的简单数据框:
Take this simple data frame of linked ids:
test <- data.frame(id1=c(10,10,1,1,24,8),id2=c(1,36,24,45,300,11))
> test
id1 id2
1 10 1
2 10 36
3 1 24
4 1 45
5 24 300
6 8 11
我现在想将链接的所有ID组合在一起. 所谓链接",是指按照链接链进行操作,以使一组中的所有ID 一起贴上标签.一种分支结构.即:
I now want to group together all the ids which link. By 'link', I mean follow through the chain of links so that all ids in one group are labelled together. A kind of branching structure. i.e:
Group 1
10 --> 1, 1 --> (24,45)
24 --> 300
300 --> NULL
45 --> NULL
10 --> 36, 36 --> NULL,
Final group members: 10,1,24,36,45,300
Group 2
8 --> 11
11 --> NULL
Final group members: 8,11
现在,我大致了解了我想要的逻辑,但是不知道如何优雅地实现它.我正在考虑递归地使用match
或%in%
遍历每个分支,但这次确实很困惑.
Now I roughly know the logic I would want, but don't know how I would implement it elegantly. I am thinking of a recursive use of match
or %in%
to go down each branch, but am truly stumped this time.
我要追求的最终结果是:
The final result I would be chasing is:
result <- data.frame(group=c(1,1,1,1,1,1,2,2),id=c(10,1,24,36,45,300,8,11))
> result
group id
1 1 10
2 1 1
3 1 24
4 1 36
5 1 45
6 1 300
7 2 8
8 2 11
推荐答案
Bioconductor软件包 RBGL (BOOST图形库的R接口)包含
connectedComp()
函数,用于识别图中的已连接组件-
正是您想要的.
The Bioconductor package RBGL (an R interface to the BOOST graph library) contains
a function, connectedComp()
, which identifies the connected components in a graph --
just what you are wanting.
(要使用该功能,您首先需要安装图和 RBGL 软件包,这些软件包可在此处.)
(To use the function, you will first need to install the graph and RBGL packages, available here and here.)
library(RBGL)
test <- data.frame(id1=c(10,10,1,1,24,8),id2=c(1,36,24,45,300,11))
## Convert your 'from-to' data to a 'node and edge-list' representation
## used by the 'graph' & 'RBGL' packages
g <- ftM2graphNEL(as.matrix(test))
## Extract the connected components
cc <- connectedComp(g)
## Massage results into the format you're after
ld <- lapply(seq_along(cc),
function(i) data.frame(group = names(cc)[i], id = cc[[i]]))
do.call(rbind, ld)
# group id
# 1 1 10
# 2 1 1
# 3 1 24
# 4 1 36
# 5 1 45
# 6 1 300
# 7 2 8
# 8 2 11
这篇关于确定链接在一起的链接情节的组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!