How to do network analysis on three fields simultaneously in R
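Problem description
How can I do network analysis on three fields simultaneously in R?
Below is the sample data, with the desired output in the last column.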
df <- data.frame(
  stringsAsFactors = FALSE,
  id_1 = c("ABC","ABC","BCD",
           "CDE","DEF","EFG","GHI","HIJ","IJK","JKL",
           "GHI","KLM","LMN","MNO","NOP"),
  id_2 = c("1A","2A","3A",
           "1A","4A","5A","6A","8A","9A","10A","7A",
           "12A","13A","14A","15A"),
  id_3 = c("Z3","Z2","Z1",
           "Z4","Z1","Z5","Z5","Z6","Z7","Z8","Z6","Z8",
           "Z9","Z9","Z1"),
  Name = c("StackOverflow1",
           "StackOverflow2","StackOverflow3","StackOverflow4",
           "StackOverflow5","StackOverflow6",
           "StackOverflow7","StackOverflow8","StackOverflow9",
           "StackOverflow10","StackOverflow11","StackOverflow12",
           "StackOverflow13","StackOverflow14","StackOverflow15"),
  desired_output = c(1L,1L,2L,1L,2L,
                     3L,3L,3L,4L,5L,3L,5L,6L,6L,2L)
)
df
#> id_1 id_2 id_3 Name desired_output
#> 1 ABC 1A Z3 StackOverflow1 1
#> 2 ABC 2A Z2 StackOverflow2 1
#> 3 BCD 3A Z1 StackOverflow3 2
#> 4 CDE 1A Z4 StackOverflow4 1
#> 5 DEF 4A Z1 StackOverflow5 2
#> 6 EFG 5A Z5 StackOverflow6 3
#> 7 GHI 6A Z5 StackOverflow7 3
#> 8 HIJ 8A Z6 StackOverflow8 3
#> 9 IJK 9A Z7 StackOverflow9 4
#> 10 JKL 10A Z8 StackOverflow10 5
#> 11 GHI 7A Z6 StackOverflow11 3
#> 12 KLM 12A Z8 StackOverflow12 5
#> 13 LMN 13A Z9 StackOverflow13 6
#> 14 MNO 14A Z9 StackOverflow14 6
#> 15 NOP 15A Z1 StackOverflow15 2
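I can already do network analysis on two fields simultaneously using igraph, as in an earlier answer of my own, but I am unable to do it for three fields.
Please help.
I feel that my current approach (two iterations) can be optimized.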
library(igraph)
library(tidyverse)
graph.data.frame(df) %>%
  components() %>%
  pluck(membership) %>%
  stack() %>%
  set_names(c('GRP', 'id_1')) %>%
  right_join(df %>% mutate(id_1 = as.factor(id_1)), by = c('id_1')) %>%
  select(GRP, id_3) %>%
  graph.data.frame() %>%
  components() %>%
  pluck(membership) %>%
  stack() %>%
  set_names(c('GRP', 'id_3')) %>%
  right_join(df %>% mutate(id_3 = as.factor(id_3)), by = c('id_3'))
#> GRP id_3 id_1 id_2 Name desired_output
#> 1 1 Z3 ABC 1A StackOverflow1 1
#> 2 1 Z2 ABC 2A StackOverflow2 1
#> 3 2 Z1 BCD 3A StackOverflow3 2
#> 4 2 Z1 DEF 4A StackOverflow5 2
#> 5 2 Z1 NOP 15A StackOverflow15 2
#> 6 1 Z4 CDE 1A StackOverflow4 1
#> 7 3 Z5 EFG 5A StackOverflow6 3
#> 8 3 Z5 GHI 6A StackOverflow7 3
#> 9 3 Z6 HIJ 8A StackOverflow8 3
#> 10 3 Z6 GHI 7A StackOverflow11 3
#> 11 4 Z7 IJK 9A StackOverflow9 4
#> 12 5 Z8 JKL 10A StackOverflow10 5
#> 13 5 Z8 KLM 12A StackOverflow12 5
#> 14 6 Z9 LMN 13A StackOverflow13 6
#> 15 6 Z9 MNO 14A StackOverflow14 6
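Created on 2021-11-15 by the reprex package (v2.0.1)
Recommended answer
Create a list of all connections between the vertices defined by the id columns and the row number (function f). In the end, you are only interested in the connections between rows.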
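library(igraph)
library(tidyverse)

# For one row, given c(id_1, id_2, id_3, row), return the edges between
# consecutive IDs plus one edge from the row vertex to each ID.
f <- function(vec){
  i <- last(vec)         # row number (last element)
  vec <- head(vec, -1)   # the IDs of that row
  c(
    seq_len(length(vec) - 1) %>% map(~vec[.x:(.x+1)]),  # ID-to-ID edges
    vec %>% map(~c(i, .x))                              # row-to-ID edges
  )
}

df$desired_output <- df %>%
  select(matches("^id_[0-9]+$")) %>%
  mutate(row = row_number()) %>%
  pmap(~f(c(...))) %>%                  # list of edges for every row
  flatten() %>%
  reduce(rbind) %>%                     # two-column character matrix
  igraph::graph_from_edgelist() %>%
  components() %>%
  membership() %>%
  .[as.character(seq_len(nrow(df)))]    # keep only the row vertices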
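Edit
Think about the connections between the IDs. What you are interested in is the connections between rows. To get those, you add one vertex per row, and each of these vertices is connected to all IDs in that row. Example for row 6: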
6 EFG 5A Z5
We are interested in the connections between the IDs (the first part of c in f):
[[1]]
[1] "EFG" "5A"
[[2]]
[1] "5A" "Z5"
and in the connections between the row and the IDs (the second part of c in f):
[[1]]
[1] "6" "EFG"
[[2]]
[1] "6" "5A"
[[3]]
[1] "6" "Z5"
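The two lists above can be reproduced by calling f on row 6 directly. The following is a minimal sketch, assuming f is copied unchanged from the answer and that the row is passed as an unnamed character vector whose last element is the row number; the names row6 and edges6 are illustrative and not part of the original answer.

```r
library(dplyr)   # last(), %>%
library(purrr)   # map()

# f as defined in the answer, repeated so this sketch runs on its own
f <- function(vec){
  i <- last(vec)
  vec <- head(vec, -1)
  c(
    seq_len(length(vec) - 1) %>% map(~vec[.x:(.x+1)]),
    vec %>% map(~c(i, .x))
  )
}

# Row 6: three IDs followed by the row number (hypothetical helper names)
row6 <- c("EFG", "5A", "Z5", "6")
edges6 <- f(row6)

edges6[1:2]   # ID-to-ID edges
edges6[3:5]   # row-to-ID edges
```

For three ID columns, each row therefore contributes five edges: two between neighbouring IDs and three from the row vertex to its IDs.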
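When you create the graph this way, every row vertex is linked to the IDs in its row (the original figure is not preserved here), and what you are interested in is which row vertices are connected.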
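The same idea of row vertices linked to ID vertices can be sketched with base R and igraph only. This is not the answer's code but a variation under stated assumptions: only row-to-ID edges are built (the ID-to-ID edges are not needed for connectivity), IDs are prefixed with their column name so that equal values in different columns cannot collide, and the names ids, expected, row_names, edges and grp are illustrative.

```r
library(igraph)

# ID columns of the sample data and the desired grouping from the question
ids <- data.frame(
  id_1 = c("ABC","ABC","BCD","CDE","DEF","EFG","GHI","HIJ","IJK","JKL",
           "GHI","KLM","LMN","MNO","NOP"),
  id_2 = c("1A","2A","3A","1A","4A","5A","6A","8A","9A","10A","7A",
           "12A","13A","14A","15A"),
  id_3 = c("Z3","Z2","Z1","Z4","Z1","Z5","Z5","Z6","Z7","Z8","Z6","Z8",
           "Z9","Z9","Z1"),
  stringsAsFactors = FALSE
)
expected <- c(1L,1L,2L,1L,2L,3L,3L,3L,4L,5L,3L,5L,6L,6L,2L)

# One vertex per row (assumed naming scheme: "row_<n>")
row_names <- paste0("row_", seq_len(nrow(ids)))

# One edge per (row, ID) pair, stacked over all ID columns
edges <- do.call(rbind, lapply(names(ids), function(col) {
  cbind(row_names, paste(col, ids[[col]], sep = ":"))
}))

g <- graph_from_edgelist(edges, directed = FALSE)

# Component id of each row vertex, relabelled in order of first appearance
grp <- components(g)$membership[row_names]
grp <- match(grp, unique(grp))

all(grp == expected)
```

Because the edges are generated per column, this sketch works for any number of ID columns without further changes.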
Note
You can use directed = FALSE when creating the graph for this result and, if you are interested, mode = "strong" in components().
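To see what these two arguments do, here is a small sketch on the edges of rows 6 and 7 only. It assumes the row-to-ID edges are stored as (row, ID) pairs, as in f; the object names are illustrative. On a directed graph, the default mode = "weak" ignores edge direction, while mode = "strong" requires a directed path in both directions.

```r
library(igraph)

# Row-to-ID edges for rows 6 and 7 of the sample data; both rows share Z5
edges <- rbind(
  c("6", "EFG"), c("6", "5A"), c("6", "Z5"),
  c("7", "GHI"), c("7", "6A"), c("7", "Z5")
)

g_dir <- graph_from_edgelist(edges, directed = TRUE)
g_und <- graph_from_edgelist(edges, directed = FALSE)

n_weak   <- components(g_dir, mode = "weak")$no    # direction ignored
n_strong <- components(g_dir, mode = "strong")$no  # direction respected
n_und    <- components(g_und)$no                   # undirected graph

c(weak = n_weak, strong = n_strong, undirected = n_und)
```

In this sketch the weak components of the directed graph coincide with the components of the undirected graph, so rows 6 and 7 land in one group either way. With mode = "strong" every vertex ends up in its own component, because all edges point from a row to an ID.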