Using iGraph in Python for community detection and writing the community number for each node to CSV
Question
I have a network that I would like to analyze using the edge_betweenness community detection algorithm in iGraph. I'm familiar with NetworkX, but am trying to learn iGraph because of its additional community detection methods over NetworkX.
My ultimate goal is to run edge_betweenness community detection, find the optimal number of communities, and write a CSV with the community membership of each node in the graph.
Below is my code as it currently stands. Any help figuring out community membership is greatly appreciated.
Input data ('network.txt'):
1 2
2 3
2 7
3 1
4 2
4 6
5 4
5 6
7 4
7 8
8 9
9 7
10 7
10 8
10 9
iGraph code:
import igraph
# load data into a graph
g = igraph.Graph.Read_Ncol('network.txt')
# plot graph
igraph.plot(g)
# identify communities
communities = igraph.community_edge_betweenness()
# not really sure what to do next
num_communities = communities.optimal_count
communities.as_clustering(num_communities)
How do I find the optimal number of communities and write out which community each node in the graph belongs to?
Recommended answer
You are on the right track. The optimal number of communities (where "optimal" is defined as "the number of communities that maximizes the modularity score") can be retrieved from communities.optimal_count, and the community structure can be converted into a flat disjoint clustering using communities.as_clustering(num_communities). The number of communities can actually be omitted if it happens to be equal to communities.optimal_count. Once you've done that, you get a VertexClustering object with a membership property, which gives you the cluster index for each vertex in the graph.
For the sake of clarity, I'm renaming your communities variable to dendrogram, because the edge betweenness community detection algorithm actually produces a dendrogram:
# calculate dendrogram (graph is the Graph object loaded from the file; it is called g in the question)
dendrogram = graph.community_edge_betweenness()
# convert it into a flat clustering
clusters = dendrogram.as_clustering()
# get the membership vector
membership = clusters.membership
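To make the layout of the membership vector concrete: it is a plain list with one cluster index per vertex, in vertex order, so it lines up with graph.vs["name"]. The sketch below groups node names by community using only the standard library. The helper name group_by_cluster and the sample values are hypothetical, chosen for illustration; they are not the actual clustering of the network above.

```python
from collections import defaultdict


def group_by_cluster(names, membership):
    """Map each cluster index to the list of node names assigned to it."""
    groups = defaultdict(list)
    for name, cluster_index in zip(names, membership):
        groups[cluster_index].append(name)
    return dict(groups)


# Illustrative values only; with igraph these would come from
# graph.vs["name"] and clusters.membership.
names = ["1", "2", "3", "4"]
membership = [0, 0, 1, 1]
groups = group_by_cluster(names, membership)
```

Here groups maps cluster index 0 to the first two names and cluster index 1 to the last two.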
Now we can write the membership vector, along with the node names, into a CSV file:
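import csv
from itertools import izip

# Python 2: open in binary mode so the csv module controls line endings
with open("output.csv", "wb") as f:
    writer = csv.writer(f)
    # one row per vertex: node name, cluster index
    for name, cluster_index in izip(graph.vs["name"], membership):
        writer.writerow([name, cluster_index])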
If you are using Python 3, use zip instead of izip, and there is no need to import itertools. In Python 3 the file should also be opened in text mode, with open("output.csv", "w", newline=""), rather than with "wb".
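For Python 3, the writing step might look like the sketch below. The function name write_membership_csv, the header row, and the sample values are assumptions added for illustration; in practice names and membership would be graph.vs["name"] and clusters.membership.

```python
import csv


def write_membership_csv(path, names, membership):
    """Write one (node name, cluster index) row per vertex."""
    # newline="" lets the csv module control line endings in Python 3.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["node", "community"])  # header row is an addition
        for name, cluster_index in zip(names, membership):
            writer.writerow([name, cluster_index])


# Illustrative values only, not the real clustering of the network above.
write_membership_csv("output.csv", ["1", "2", "3"], [0, 0, 1])
```

Using a with block also ensures the file is flushed and closed once all rows are written.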