笔形图找到所有最大的集团而没有重叠 [英] R igraph find all maximal cliques without overlapping

查看:122
本文介绍了笔形图找到所有最大的集团而没有重叠的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试查找图中的所有最大集团,不要重叠. 函数max_cliques()返回图形中所有可能的最大派系,但我希望每个顶点仅包含在一个派系中.

例如,如果max_cliques()的输出是以下类别:

{A,B,C},{A,B,D},{A,B,J,K},{E,F,G,H},{E,F,G,I}

我想删除一些集团,以便所有顶点都以一个集团的形式出现,所以最终的集合将是:

{A,B,J,K},{E,F,G,H}

A和B包含在3个派系中,所以我想选择这些派系,以便最终集合将包括最大顶点.如果在相同的长度内有两个可能的小集团,则随机抽取一个. (我不介意不包括所有顶点)

即使不深入讨论集团,我也很想解决这个问题的主意-问题基本上是如何删除包含重叠元素的最短列表".

预先感谢

解决方案

当您询问覆盖范围独立集问题.这些是 NP完全问题.这意味着随着图形的增长,计算时间将成倍增加.

我认为这是您的目标.我的方法如下:

  1. 寻找集团.
  2. 转换为关联矩阵(按节点分类).
  3. 将入射矩阵乘以其转置(%*%),即可创建邻接矩阵
  4. 从邻接矩阵创建团体图(如果其他团体共享一个节点,它们将连接到其他团体)
  5. 找到所有独立的顶点集(这是瓶颈)
  6. 检索原始节点以获取独立的集团
  7. 查找包含最多节点的集合.

代码

library(igraph)  
set.seed(8675309)  
g <- graph_from_edgelist(matrix(sample(LETTERS[1:10], 50, replace=T), ncol = 2), directed = FALSE)  
plot(g, edge.arrow.size=0.5)

cliques <- max_cliques(g)

cliqueBP <- matrix(c(rep(paste0("cl", seq_along(cliques)), sapply(cliques, length)), names(unlist(cliques))), ncol=2, )
bp <- graph_from_edgelist(cliqueBP, directed = F)
V(bp)$type <- grepl("cl", V(bp)$name)
# plot(bp, layout=layout_as_bipartite)

bp.ind <- t(as_incidence_matrix(bp))
bp.adj <- bp.ind %*% t(bp.ind)

bp.adj.g <- graph_from_adjacency_matrix(bp.adj, mode = "undirected")
# plot(simplify(bp.adj.g))
bp.adj.mis <- independent.vertex.sets(bp.adj.g)

sets <- lapply(bp.adj.mis, function(x) cliqueBP[cliqueBP[,1] %in% as_ids(x), 2])
sets[which(sapply(sets, length) == max(sapply(sets, length)))]

# [[1]]
# [1] "G" "J" "E" "I" "B" "H" "F" "D"
# 
# [[2]]
# [1] "G" "J" "E" "I" "F" "C" "B" "H"
# 
# [[3]]
# [1] "G" "J" "E" "I" "F" "C" "A" "H"
# 
# [[4]]
# [1] "G" "B" "E" "I" "F" "C" "A" "H"
# 
# [[5]]
# [1] "G" "B" "E" "I" "F" "C" "H" "J"

I am trying to find all maximal cliques in a graph, without overlapping. the function max_cliques() returns all possible maximal cliques in the graph, but I want every vertex to be included in only one clique- in the largest clique it can be part of.

for example, if the output of the max_cliques() are the following cliques:

{A,B,C}, {A,B,D}, {A,B,J,K}, {E,F,G,H}, {E,F,G,I}

I want to remove some cliques so that all the vertexes will appear in exacly one clique, so the final set will be:

{A,B,J,K}, {E,F,G,H}

A and B are included in 3 cliques, so I want to choose the cliques so that the final set will include maximum vertexes as possible. if there are two possible cliques in the same length- take a random one. (I don't mind to not include all the vertexes)

I would really appreciate an idea to solve this problem, even without going into details of cliques- the question is basically how to remove the shortest "lists" that contain overlapping elements.

thanks in advance

解决方案

This is apparently a pretty hard problem to solve as you asking about Coverage and Independent Set problems. These are NP-complete problems. What this means is that as your graph grows computational time is going to increase exponentially.

I think this what you are aiming for. My approach is as follows:

  1. Find cliques.
  2. Convert to incidence matrix (cliques by nodes).
  3. Multiply the incidence matrix by its transpose (%*%) this creates an adjacency matrix
  4. create graph of cliques from adjacency matrix (cliques are connected to other cliques if they share a node)
  5. Find all independent sets of vertices (this is the bottleneck)
  6. Retrieve original nodes for independent sets of cliques
  7. Find set with most nodes.

Code

library(igraph)  
set.seed(8675309)  
g <- graph_from_edgelist(matrix(sample(LETTERS[1:10], 50, replace=T), ncol = 2), directed = FALSE)  
plot(g, edge.arrow.size=0.5)

cliques <- max_cliques(g)

cliqueBP <- matrix(c(rep(paste0("cl", seq_along(cliques)), sapply(cliques, length)), names(unlist(cliques))), ncol=2, )
bp <- graph_from_edgelist(cliqueBP, directed = F)
V(bp)$type <- grepl("cl", V(bp)$name)
# plot(bp, layout=layout_as_bipartite)

bp.ind <- t(as_incidence_matrix(bp))
bp.adj <- bp.ind %*% t(bp.ind)

bp.adj.g <- graph_from_adjacency_matrix(bp.adj, mode = "undirected")
# plot(simplify(bp.adj.g))
bp.adj.mis <- independent.vertex.sets(bp.adj.g)

sets <- lapply(bp.adj.mis, function(x) cliqueBP[cliqueBP[,1] %in% as_ids(x), 2])
sets[which(sapply(sets, length) == max(sapply(sets, length)))]

# [[1]]
# [1] "G" "J" "E" "I" "B" "H" "F" "D"
# 
# [[2]]
# [1] "G" "J" "E" "I" "F" "C" "B" "H"
# 
# [[3]]
# [1] "G" "J" "E" "I" "F" "C" "A" "H"
# 
# [[4]]
# [1] "G" "B" "E" "I" "F" "C" "A" "H"
# 
# [[5]]
# [1] "G" "B" "E" "I" "F" "C" "H" "J"

这篇关于笔形图找到所有最大的集团而没有重叠的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆