Number of nodes/edges in a large graph via Gremlin?

Question

What is the easiest & most efficient way to count the number of nodes/edges in a large graph via Gremlin? The best I have found is using the V iterator:

gremlin> g.V.gather{it.size()}

However, this is not a viable option for large graphs, per the documentation for V:

The vertex iterator for the graph. Utilize this to iterate through all the vertices in the graph. Use with care on large graphs unless used in combination with a key index lookup.

Recommended answer

I think the preferred way to do a count of all vertices would be:

gremlin> g = TinkerGraphFactory.createTinkerGraph()
==>tinkergraph[vertices:6 edges:6]
gremlin> g.V.count()
==>6
gremlin> g.E.count()
==>6

However, I think that on a very large graph, g.V and g.E simply break down no matter what you do. On a very large graph, the best option for counting is a tool like Faunus (http://thinkaurelius.github.io/faunus/), which lets you leverage Hadoop to do the counts in parallel.

UPDATE: The original answer above was for TinkerPop 2.x. For TinkerPop 3.x the answer is largely the same: for very large graphs, use Gremlin Spark or some provider-specific tooling (like DSE GraphFrames for DataStax Graph) that is optimized for those kinds of large-scale traversals.
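
For reference, here is a rough sketch of what the same counts might look like in the TinkerPop 3.x Gremlin Console. The first part uses the in-memory "modern" toy graph that ships with TinkerGraph. The second part assumes the Hadoop and Spark plugins are enabled in the console, and the properties file path is only an illustrative placeholder, not something taken from the answer above. The exact way to obtain a traversal source varies between 3.x releases, so treat this as a starting point rather than a definitive recipe.

```groovy
// Part 1: small in-memory graph (TinkerPop 3.x syntax).
// Steps take parentheses in 3.x: g.V() rather than the 2.x g.V.
graph = TinkerFactory.createModern()   // toy graph with 6 vertices and 6 edges
g = graph.traversal()
g.V().count()                          // number of vertices
g.E().count()                          // number of edges

// Part 2: the same count run as an OLAP job on Spark.
// Assumptions: the console was started with the tinkerpop.hadoop and
// tinkerpop.spark plugins active, and 'conf/hadoop/hadoop-gryo.properties'
// (placeholder path) points at the input data for the graph.
graph = GraphFactory.open('conf/hadoop/hadoop-gryo.properties')
g = graph.traversal().withComputer(SparkGraphComputer)
g.V().count()                          // count computed in parallel by Spark
```

The traversal itself is identical in both parts. Only the traversal source changes, which is what lets the count be distributed instead of iterating every vertex in a single process.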
