Using iGraph in Python for community detection and writing the community number for each node to CSV


Problem description

I have a network that I would like to analyze using the edge_betweenness community detection algorithm in iGraph. I'm familiar with NetworkX, but I am trying to learn iGraph because it offers community detection methods that NetworkX does not.

My ultimate goal is to run edge_betweenness community detection, find the optimal number of communities, and write a CSV with the community membership of each node in the graph.

Below is my code as it currently stands. Any help figuring out community membership is greatly appreciated.

Input data ('network.txt'):

1 2
2 3
2 7
3 1
4 2
4 6
5 4
5 6
7 4
7 8
8 9
9 7
10 7
10 8
10 9

iGraph code

import igraph

# load data into a graph
g = igraph.Graph.Read_Ncol('network.txt')

# plot graph
igraph.plot(g)

# identify communities
communities = g.community_edge_betweenness()

# not really sure what to do next
num_communities = communities.optimal_count
communities.as_clustering(num_communities)

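What do I need to do to find the optimal number of communities and write out, as a list, the community that each node in the graph belongs to?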

Recommended answer

You are on the right track. The optimal number of communities (where "optimal" is defined as "the number of communities that maximizes the modularity score") can be retrieved from communities.optimal_count, and the community structure can be converted into a flat, disjoint clustering with communities.as_clustering(num_communities). The number of communities can be omitted when it equals communities.optimal_count, since that is the default. Once you've done that, you get a VertexClustering object whose membership property gives the cluster index of each vertex in the graph.
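To make "maximizes the modularity score" concrete, here is a minimal sketch that computes modularity by hand with the usual formula for an undirected, unweighted graph: for each community, the fraction of edges inside it minus the squared fraction of edge endpoints that touch it. It uses the edge list from the question; the three-way split is hand-picked for illustration and is not claimed to be what community_edge_betweenness returns. The names EDGES, modularity, split and one_block are hypothetical, and igraph's own computation may differ in how it treats direction and weights.

```python
from collections import Counter

# Edge list from the question's network.txt, treated as undirected here.
EDGES = [(1, 2), (2, 3), (2, 7), (3, 1), (4, 2), (4, 6), (5, 4), (5, 6),
         (7, 4), (7, 8), (8, 9), (9, 7), (10, 7), (10, 8), (10, 9)]


def modularity(edges, community_of):
    """Modularity of a partition of an undirected, unweighted graph.

    community_of maps each node to its community label.
    """
    m = len(edges)
    degree_sum = Counter()  # total degree of the nodes in each community
    internal = Counter()    # edges with both endpoints in the same community
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Hand-picked partition (an assumption for illustration only).
split = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 2, 8: 2, 9: 2, 10: 2}
# Trivial partition: every node in a single community.
one_block = {node: 0 for node in split}

q_split = modularity(EDGES, split)
q_one = modularity(EDGES, one_block)
print(q_split, q_one)
```

Putting every node in one community scores zero, while a partition that keeps most edges inside communities scores higher. Cutting the dendrogram at optimal_count amounts to picking the level with the highest such score.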

For the sake of clarity, I'm renaming your communities variable to dendrogram, because the edge betweenness community detection algorithm actually produces a dendrogram:

# calculate dendrogram (graph is the igraph Graph object, named g in the question)
dendrogram = graph.community_edge_betweenness()
# convert it into a flat clustering
clusters = dendrogram.as_clustering()
# get the membership vector
membership = clusters.membership
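The membership vector is parallel to the vertex list: entry i is the community index of vertex i. A minimal sketch of turning such a vector into per-community node lists follows. The names and membership lists are hypothetical stand-ins for graph.vs["name"] and clusters.membership, so the snippet runs without igraph; group_by_community is an illustrative helper, not an igraph function.

```python
from collections import defaultdict


def group_by_community(names, membership):
    """Map each community index to the list of node names assigned to it."""
    groups = defaultdict(list)
    for name, community in zip(names, membership):
        groups[community].append(name)
    return dict(groups)


# Hypothetical stand-ins for graph.vs["name"] and clusters.membership.
names = ["1", "2", "3", "7"]
membership = [0, 0, 0, 1]

groups = group_by_community(names, membership)
for community, members in sorted(groups.items()):
    print(community, members)
```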

Now we can write the membership vector, along with the node names, into a CSV file:

import csv

# Python 3: open in text mode; newline="" lets the csv module control line endings
with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)
    # pair each node name with its community index
    for name, community in zip(graph.vs["name"], membership):
        writer.writerow([name, community])

If you are using Python 2, use itertools.izip instead of zip and open the file in "wb" mode without the newline argument.
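The CSV step can be tried on its own, without igraph installed. The following is a minimal sketch assuming Python 3; the names and membership lists are hypothetical stand-ins for graph.vs["name"] and clusters.membership, and write_membership, the header row and the column names "node" and "community" are illustrative choices that are not part of the original answer.

```python
import csv
import io


def write_membership(fileobj, names, membership):
    """Write one (node, community) row per vertex, preceded by a header row."""
    if len(names) != len(membership):
        # zip() would silently drop the extra items, so fail loudly instead
        raise ValueError("names and membership must have the same length")
    writer = csv.writer(fileobj)
    writer.writerow(["node", "community"])
    for name, community in zip(names, membership):
        writer.writerow([name, community])


# Hypothetical stand-ins for graph.vs["name"] and clusters.membership.
names = ["1", "2", "3", "7"]
membership = [0, 0, 0, 1]

# An in-memory buffer keeps the example free of side effects; for a real file,
# pass open("output.csv", "w", newline="") instead.
buffer = io.StringIO()
write_membership(buffer, names, membership)

# Read the data back to check the round trip; csv returns every field as a string.
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
print(rows)
```

The explicit length check is there because a mismatch between the two lists usually means the membership vector came from a different graph than the names.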
