Load nodes with attributes and edges from DataFrame to NetworkX


Question

I am new to working with graphs in Python using NetworkX. Until now I have used Gephi, where the standard steps (though not the only possible ones) are:

  1. Load the node information from a table/spreadsheet; one of the columns should be the ID and the rest are metadata about the nodes (the nodes are people, so gender, group, ... normally used for coloring). For example:

id;NormalizedName;Gender
per1;Jesús;male
per2;Abraham;male
per3;Isaac;male
per4;Jacob;male
per5;Judá;male
per6;Tamar;female
...

  2. Then load the edges, also from a table/spreadsheet, using the same node names as in the ID column of the nodes spreadsheet, normally with four columns (Target, Source, Weight and Type):

    Target;Source;Weight;Type
    per1;per2;3;Undirected
    per3;per4;2;Undirected
    ...
    

    These are the two DataFrames that I have and want to load in Python. From reading about NetworkX, it seems it is not directly possible to load two tables (one for nodes, one for edges) into the same graph, and I am not sure what the best approach would be:

    1. Should I create a graph with only the node information from the DataFrame, and then add (append) the edges from the other DataFrame? If so, and since nx.from_pandas_dataframe() expects information about the edges, I guess I shouldn't use it to create the nodes... Should I just pass the information as lists?

    2. Should I create a graph with only the edge information from the DataFrame, and then add the information from the other DataFrame to each node as attributes? Is there a better way to do that than iterating over the DataFrame and the nodes?

    Recommended answer

    Use nx.from_pandas_dataframe (renamed to nx.from_pandas_edgelist in later NetworkX releases) to build the graph from the edges:

    import networkx as nx
    import pandas as pd
    
    edges = pd.DataFrame({'source' : [0, 1],
                          'target' : [1, 2],
                          'weight' : [100, 50]})
    
    nodes = pd.DataFrame({'node' : [0, 1, 2],
                          'name' : ['Foo', 'Bar', 'Baz'],
                          'gender' : ['M', 'F', 'M']})
    
    G = nx.from_pandas_dataframe(edges, 'source', 'target', 'weight')
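
For readers on a recent NetworkX release, the same step might look like the sketch below. It assumes `nx.from_pandas_edgelist` is available (the newer name for this loader) and reuses the `edges` frame from the answer; the `edge_weights` dictionary is only there to illustrate how the attribute can be read back.

```python
import networkx as nx
import pandas as pd

edges = pd.DataFrame({'source': [0, 1],
                      'target': [1, 2],
                      'weight': [100, 50]})

# edge_attr selects which columns are stored as edge attributes.
G = nx.from_pandas_edgelist(edges, source='source', target='target',
                            edge_attr='weight')

# Read the weights back, keyed by (u, v) node pairs.
edge_weights = {(u, v): d['weight'] for u, v, d in G.edges(data=True)}
```

Nodes are created implicitly from the values in the source and target columns, so a node that appears in no edge is not part of `G` at this point.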
    

    Then use set_node_attributes:

    nx.set_node_attributes(G, 'name', pd.Series(nodes.name, index=nodes.node).to_dict())
    nx.set_node_attributes(G, 'gender', pd.Series(nodes.gender, index=nodes.node).to_dict())
    

    Or iterate over the graph to add the node attributes:

    for i in sorted(G.nodes()):
        G.node[i]['name'] = nodes.name[i]
        G.node[i]['gender'] = nodes.gender[i]
    

    Update:

    As of nx 2.0 the argument order of nx.set_node_attributes has changed: (G, values, name=None)

    Using the example above:

    nx.set_node_attributes(G, pd.Series(nodes.gender, index=nodes.node).to_dict(), 'gender')
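
One caveat with the pattern above: `pd.Series(nodes.gender, index=nodes.node)` reindexes the column rather than relabelling it, so it only works here because the node IDs happen to match the default 0, 1, 2 row index. With string IDs such as `per1` from the question, a safer sketch is to call `set_index` first. The version below assumes NetworkX 2.x and attaches every metadata column in one call by passing a dict of dicts; the small frames are made up from the question's sample rows.

```python
import networkx as nx
import pandas as pd

nodes = pd.DataFrame({'id': ['per1', 'per2', 'per6'],
                      'NormalizedName': ['Jesús', 'Abraham', 'Tamar'],
                      'Gender': ['male', 'male', 'female']})
edges = pd.DataFrame({'Source': ['per2'], 'Target': ['per1'], 'Weight': [3]})

G = nx.from_pandas_edgelist(edges, source='Source', target='Target',
                            edge_attr='Weight')

# Build {node_id: {column: value, ...}} keyed by the real node IDs.
attrs = nodes.set_index('id').to_dict('index')

# With a dict of dicts and no name, each inner dict is merged into
# the attributes of the matching node.
nx.set_node_attributes(G, attrs)
```

Entries for IDs that are not in the graph (here `per6`, which has no edge) are skipped, so people without any edge do not show up in `G` with this edges-first approach.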
    

    As of nx 2.4, G.node[] has been replaced by G.nodes[].
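
The first option raised in the question (nodes first, then edges) is also workable and keeps people who have no edges. The following is a minimal sketch under the assumption of NetworkX 2.x and an undirected graph (the edge table's Type column says Undirected); the frames are abbreviated from the question's sample data, and attributes are read back through `G.nodes[...]`.

```python
import networkx as nx
import pandas as pd

nodes = pd.DataFrame({'id': ['per1', 'per2', 'per3', 'per4', 'per6'],
                      'NormalizedName': ['Jesús', 'Abraham', 'Isaac', 'Jacob', 'Tamar'],
                      'Gender': ['male', 'male', 'male', 'male', 'female']})
edges = pd.DataFrame({'Target': ['per1', 'per3'],
                      'Source': ['per2', 'per4'],
                      'Weight': [3, 2]})

G = nx.Graph()

# add_nodes_from accepts (node_id, attribute_dict) pairs.
G.add_nodes_from(nodes.set_index('id').to_dict('index').items())

# add_weighted_edges_from accepts (u, v, weight) triples and stores
# the third value under the 'weight' edge attribute.
G.add_weighted_edges_from(
    edges[['Source', 'Target', 'Weight']].itertuples(index=False, name=None))

# Nodes without any edge are kept, unlike in the edges-first approach.
isolated = list(nx.isolates(G))
tamar_gender = G.nodes['per6']['Gender']
```

Neither step loops over rows in Python by hand; both hand an iterable built from the DataFrame straight to NetworkX.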
