What is the optimal way to create a graph with the add_edge_list() method?


Problem description

I am trying to create a large graph with the graph-tool library (around 10^6 to 10^7 vertices) and either fill a vertex property with the vertex names or use names instead of vertex indices. I have:

  1. A list of names:

['50', '56', '568']

  • A set of edges, given by vertex names rather than vertex indices:

    edge_list = {frozenset({'568', '56'}), frozenset({'56', '50'}), frozenset({'50', '568'})}
    

  • Since add_edge_list() creates any vertices that are not yet in the graph, I am trying to use it to fill an empty graph. That works, but when I try to get a vertex by its name, I get an error saying that there is no vertex with that index.

    Here is the code of my program:

    # Assumed import; the question only shows the alias grt
    import graph_tool.all as grt

    g = grt.Graph(directed=False)
    edge_list = {frozenset({'568', '56'}), frozenset({'56', '50'}), frozenset({'50', '568'})}
    ids = ['50', '56', '568']
    g.add_edge_list(edge_list, hashed=True, string_vals=True)
    print(g.vertex('50'))
    

    The error message from print(g.vertex('50')):

    ValueError: Invalid vertex index: 50
    

    I want to create the graph:

    1. Using only edge_list;
    2. With quick access to a vertex by its name;
    3. Optimally in terms of time (and RAM, if possible).

    Is there a good way to do this?

    Current code:

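    g = grt.Graph(directed=False)
    g.add_vertex(len(ids))
    vprop = g.new_vertex_property("string", vals=ids)
    g.vp.user_id = vprop
    # ids_dict is not shown in the question; assumed to map each name to its index in ids
    ids_dict = {name: i for i, name in enumerate(ids)}
    for vert1, vert2 in edge_list:
        g.add_edge(g.vertex(ids_dict[vert1]), g.vertex(ids_dict[vert2]))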
    

    Recommended answer

    If you have a dense graph with 10^6 to 10^7 vertices (is it medical data or a social graph? That can change everything), you shouldn't use networkx: it is written in pure Python, so it is roughly 10 to 100 times slower than graph-tool or igraph. In your case I recommend graph-tool. It is the fastest Python graph-processing library (roughly on par with igraph).

    graph-tool behaves differently from networkx. When you create a networkx node, its identifier is whatever you passed to the node constructor, so you can get the node by that ID. In graph-tool, every vertex ID is an integer from 0 to GRAPH_SIZE - 1:

    Each vertex in a graph has a unique index, which is always between 0 and N−1, where N is the number of vertices. This index can be obtained by using the vertex_index attribute of the graph (which is a property map, see Property maps), or by converting the vertex descriptor to an int.

    All additional information about a graph, its vertices, or its edges is stored in property maps. When you call .add_edge_list() with hashed=True, it returns a new property map holding the original vertex names. So in your case you should handle your vertices like this:

    # Create graph
    g = grt.Graph(directed=False)
    
    # Create edge list
    # Why frozensets? You don't really need them; a list of tuples works as well
    edge_list = {
        frozenset({'568', '56'}),
        frozenset({'56', '50'}),
        frozenset({'50', '568'})
    }
    
    # Write returned PropertyMap to a variable!
    vertex_ids = g.add_edge_list(edge_list, hashed=True, string_vals=True)
    
    g.vertex(1)
    
    Out [...]: <Vertex object with index '1' at 0x7f3b5edde4b0>
    
    vertex_ids[1]
    
    Out [...]: '56'
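
To make the role of hashed=True more concrete, here is a minimal pure-Python sketch of the idea: each name gets the next free integer index the first time it appears, and the names are kept in a list that plays the role of the returned property map. The function name hash_edge_list and the first-seen ordering are assumptions made for illustration; this is not graph-tool's actual implementation.

```python
def hash_edge_list(edges):
    """Map vertex names to contiguous integer indices, first seen first."""
    index_of = {}   # name -> integer index
    names = []      # integer index -> name (stand-in for the returned property map)
    int_edges = []
    for source, target in edges:
        pair = []
        for name in (source, target):
            if name not in index_of:
                index_of[name] = len(names)  # next free index
                names.append(name)
            pair.append(index_of[name])
        int_edges.append(tuple(pair))
    return int_edges, names, index_of


# Tuples keep the endpoint order fixed, unlike sets or frozensets.
edges = [("568", "56"), ("56", "50"), ("50", "568")]
int_edges, names, index_of = hash_edge_list(edges)
print(int_edges)  # [(0, 1), (1, 2), (2, 0)]
print(names)      # ['568', '56', '50']
```

The point of the sketch is that the integer index, not the name, is what identifies a vertex inside the graph, which is why g.vertex('50') is read as a request for index 50.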
    

    If you want to get a vertex by its ID, you have to build the mapping dict manually (I am not a graph-tool guru, but I couldn't find a simpler solution):

    very_important_mapping_dict = {vertex_ids[i]: i for i in range(g.num_vertices())}

    So you can easily get a vertex index:

    very_important_mapping_dict['568']
    
    Out [...]: 0
    
    vertex_ids[0]
    
    Out [...]: '568'
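
The reverse lookup can also be wrapped in a small helper that fails loudly on duplicate names. This is a hedged sketch: build_name_index is a hypothetical helper, and a plain list stands in for the property map returned by add_edge_list() so that the example runs without graph-tool installed.

```python
def build_name_index(names_by_index):
    """Return a dict mapping each vertex name to its integer index."""
    name_to_index = {}
    for index, name in enumerate(names_by_index):
        if name in name_to_index:
            raise ValueError(f"duplicate vertex name: {name!r}")
        name_to_index[name] = index
    return name_to_index


# Stand-in for the returned property map; a plain list here (assumption).
vertex_names = ["568", "56", "50"]
name_to_index = build_name_index(vertex_names)

print(name_to_index["568"])      # 0
print(name_to_index.get("999"))  # None, the name is not in the graph
```

With a real graph, the integer obtained this way can be passed to g.vertex() to get the vertex descriptor. The dict costs one entry per vertex, which is the price of constant-time lookup by name.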
    
