Construct bipartite graph from columns of python dataframe

Views: 927 Published: 2017/3/26 2:33:20 Tags: python graph dataframe networkx

Question

I have a dataframe with three columns:

data['subdomain'], data['domain'], data['IP']

I want to build a bipartite graph that connects each element of subdomain to the domain it corresponds to, with the weight being the number of times they correspond.

For example, my data could be:

subdomain , domain, IP
test1, example.org, 10.20.30.40
something, site.com, 30.50.70.90
test2, example.org, 10.20.30.41
test3, example.org, 10.20.30.42
else, website.com, 90.80.70.10

I want a bipartite graph showing that example.org has a weight of 3, since it has 3 edges, and so on. I also want to group these results together into a new dataframe.

I have been trying NetworkX, but I have no experience with it, especially when the edges need to be computed.

B=nx.Graph()
B.add_nodes_from(data['subdomain'],bipartite=0)
B.add_nodes_from(data['domain'],bipartite=1)
B.add_edges_from(...)

Recommended answer

You could use

B.add_weighted_edges_from(
    [(row['domain'], row['subdomain'], 1) for idx, row in df.iterrows()], 
    weight='weight')

to add weighted edges, or you could use

B.add_edges_from(
    [(row['domain'], row['subdomain']) for idx, row in df.iterrows()])

to add edges without weights.
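
Note that the weighted call above gives every edge the same weight of 1. If the weight should instead be the number of times a (domain, subdomain) pair occurs in the data, one possible approach is to count the pairs with pandas first. This is only a minimal sketch: the column names come from the question, but the sample rows (with one repeated pair) are made up for illustration.

```python
import pandas as pd
import networkx as nx

# Hypothetical sample data; the pair (example.org, test1) appears twice.
df = pd.DataFrame({
    'subdomain': ['test1', 'test1', 'test2', 'something'],
    'domain': ['example.org', 'example.org', 'example.org', 'site.com'],
})

# Count how often each (domain, subdomain) pair occurs.
counts = df.groupby(['domain', 'subdomain']).size().reset_index(name='weight')

B = nx.Graph()
B.add_nodes_from(counts['subdomain'], bipartite=0)
B.add_nodes_from(counts['domain'], bipartite=1)

# Each row becomes one (u, v, weight) triple.
B.add_weighted_edges_from(
    counts[['domain', 'subdomain', 'weight']].itertuples(index=False, name=None))

print(int(B['example.org']['test1']['weight']))  # 2
```

Counting before building the graph matters because a plain `nx.Graph` keeps at most one edge per node pair, so adding the same pair twice would not accumulate a weight on its own.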

You may not need weights, since the degree of a node is the number of edges adjacent to that node. For example,

>>> B.degree('example.org')
3
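
To collect these degrees into a new dataframe, as the question asks, the domain nodes can be gathered in one pass. This is a rough sketch, assuming the graph is built from the question's sample data and that nodes carry the `bipartite` attribute; the column name `n_subdomains` is made up for illustration.

```python
import pandas as pd
import networkx as nx

df = pd.DataFrame({
    'subdomain': ['test1', 'something', 'test2', 'test3', 'else'],
    'domain': ['example.org', 'site.com', 'example.org',
               'example.org', 'website.com'],
})

B = nx.Graph()
B.add_nodes_from(df['subdomain'], bipartite=0)
B.add_nodes_from(df['domain'], bipartite=1)
B.add_edges_from(zip(df['domain'], df['subdomain']))

# Keep only the domain side of the graph (bipartite == 1).
domains = [n for n, d in B.nodes(data=True) if d['bipartite'] == 1]

# B.degree(nbunch) yields (node, degree) pairs for the given nodes.
result = pd.DataFrame(list(B.degree(domains)),
                      columns=['domain', 'n_subdomains'])
print(result)
```

If only the counts are needed and no subdomain shares a name with a domain, `df.groupby('domain')['subdomain'].nunique()` should give the same numbers without building a graph at all.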

import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt

df = pd.DataFrame(
    {'IP': ['10.20.30.40',
      '30.50.70.90',
      '10.20.30.41',
      '10.20.30.42',
      '90.80.70.10'],
     'domain': ['example.org',
      'site.com',
      'example.org',
      'example.org',
      'website.com'],
     'subdomain': ['test1', 'something', 'test2', 'test3', 'else']})

B = nx.Graph()
B.add_nodes_from(df['subdomain'], bipartite=0)
B.add_nodes_from(df['domain'], bipartite=1)
B.add_weighted_edges_from(
    [(row['domain'], row['subdomain'], 1) for idx, row in df.iterrows()], 
    weight='weight')

print(B.edges(data=True))
# Example output (edge order may differ between Python and networkx versions):
# [('test1', 'example.org', {'weight': 1}), ('test3', 'example.org', {'weight': 1}),
#  ('test2', 'example.org', {'weight': 1}), ('website.com', 'else', {'weight': 1}),
#  ('site.com', 'something', {'weight': 1})]

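# Place domains in the left column (x=0) and subdomains in the right column (x=1).
# A domain that appears in several rows keeps the y-position of its last row.
pos = {node: [0, i] for i, node in enumerate(df['domain'])}
pos.update({node: [1, i] for i, node in enumerate(df['subdomain'])})
nx.draw(B, pos, with_labels=False)
for p in pos:  # raise text positions so labels sit above the nodes
    pos[p][1] += 0.25
nx.draw_networkx_labels(B, pos)

plt.show()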
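
As a sanity check on a graph like this one, networkx can confirm that the result really is bipartite. The sketch below rebuilds the same graph from the sample data. Because the graph is disconnected, the two node sets cannot be inferred uniquely from the edges alone, so they are read from the `bipartite` node attribute instead.

```python
import networkx as nx
from networkx.algorithms import bipartite

subdomains = ['test1', 'something', 'test2', 'test3', 'else']
domains = ['example.org', 'site.com', 'example.org',
           'example.org', 'website.com']

B = nx.Graph()
B.add_nodes_from(subdomains, bipartite=0)
B.add_nodes_from(domains, bipartite=1)
B.add_edges_from(zip(domains, subdomains))

# True when the nodes can be split into two sets with no edge inside either set.
print(bipartite.is_bipartite(B))

# Recover the two sides from the node attribute set at construction time.
sub_nodes = {n for n, d in B.nodes(data=True) if d['bipartite'] == 0}
dom_nodes = set(B) - sub_nodes

# Every edge should join one subdomain node and one domain node.
crossing = all((u in sub_nodes) != (v in sub_nodes) for u, v in B.edges())
print(crossing)
```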
