gremlin - query optimization - property value counts for multiple interval ranges

Views: 58 · Posted: 2021/5/13 19:30:03 · Tag: gremlin


Question

Given a vertex, an edge property, and pre-defined interval ranges [(0,100), (100,500), (500,1000), (1000,5000), ...], I want to count, for each interval, how many of the vertex's edges have a property value that falls within it.

For example, the vertex 446656 has 5 edges, each with a property trxn_amt, with the following values: [92, 380, 230, 899, 102]. This would give the group counts {(0,100): 1, (100,500): 3, (500,1000): 1, (1000,5000): 0, ...}.

My question is split into two parts.

Firstly, is there a cleaner implementation than the following project query?

g.V(446656).project('num_trxn_0_100', 'num_trxn_100_500')
    .by(bothE().where(values('trxn_amt').is(between(0.0, 100.0))).count())
    .by(bothE().where(values('trxn_amt').is(between(100.0, 500.0))).count())

==>{num_trxn_0_100=1, num_trxn_100_500=3}

^ imagine many more intervals

Secondly, how can I include an edge filter which isn't computed multiple times?

I want to add a date filter (i.e. bothE() -> bothE().has('trxn_dt_int', lt(999999999999))) and don't want this filter to be computed again for every .by(...) step. Is there a way to compute the filter a single time, store the result, and reuse it later? Alternatively, if I do include it multiple times, does any optimization happen under the hood to make sure it is only computed once?

Recommended answer

"Firstly, is there a cleaner implementation than the following project query?"

I think you realized the issue with that approach, which is why you are asking the question: you traverse bothE() multiple times to get your answer. I think that ties into your second question:

"Secondly, how can I include an edge filter which isn't computed multiple times?"

I think you can write this query better with groupCount(). To demonstrate, I've used the Grateful Dead graph:

gremlin> g = TinkerFactory.createGratefulDead().traversal()
==>graphtraversalsource[tinkergraph[vertices:808 edges:8049], standard]
gremlin> g.V(3).
......1>   bothE('followedBy').
......2>   groupCount().
......3>     by(choose(values('weight')).
......4>          option(between(0, 24), constant('small')).
......5>          option(between(25, 99), constant('medium')).
......6>          option(gte(100), constant('big')))
==>[small:140,big:2,medium:7]

Now just add your date filter for the edges before groupCount(), and it only has to be evaluated once.
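Applied to the query from the question, the same pattern might look like the sketch below. It is untested against the asker's graph; the bucket labels and the 'other' catch-all are illustrative names, not part of the original answer. Since between(a, b) includes a and excludes b, contiguous boundaries such as 100.0 and 500.0 leave no gaps between buckets.

```groovy
// Sketch only: bucket labels and the 'other' fallback are illustrative names.
g.V(446656).
  bothE().
  has('trxn_dt_int', lt(999999999999)).       // date filter, applied once before grouping
  groupCount().
    by(choose(values('trxn_amt')).
         option(between(0.0, 100.0), constant('num_trxn_0_100')).
         option(between(100.0, 500.0), constant('num_trxn_100_500')).
         option(between(500.0, 1000.0), constant('num_trxn_500_1000')).
         option(between(1000.0, 5000.0), constant('num_trxn_1000_5000')).
         option(none, constant('other')))     // catch-all for values outside every range
```

For the five example values [92, 380, 230, 899, 102], and assuming every edge passes the date filter, this should put 1, 3 and 1 edges into the first three buckets. groupCount() only emits keys that were actually seen, so an empty interval such as (1000, 5000) will be missing from the result rather than reported as 0; if zero counts are needed, they would have to be filled in on the client side.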
