How to Perform a One-off Side Effect at the Beginning of a Repeat in Gremlin?

Question

I need to run a Gremlin query that starts at the leaves of a graph (vertices with no outgoing edges) and at its edgeless vertices, then follows incoming edges one level at a time, collecting the starting vertices and each level of incoming vertices up to a certain limit. The limit must not be exceeded: if the next level of incoming vertices would push the count past the limit, those vertices are not collected and the query returns what it has so far. Here is what I have at the moment:

g.V().or(__.not(outE()),__.not(bothE())).limit(700)
.store('a')
.repeat(__.sideEffect(select('b').store('a')).in().as('b'))
.until(union(cap('a').unfold().count(),select('b').count()).sum().is(gt(700)))
.cap('a').unfold()

The problem is that the sideEffect() step inside the repeat() step is executed once for every vertex in the stream. I want it to be executed only once, regardless of how many vertices are in the stream. How do I accomplish this?

Recommended answer

I'm not sure if this approach works on CosmosDB, but it may well be the only approach you can realistically take with Gremlin that doesn't involve multiple Gremlin requests and extra processing. I use this sample graph for demonstration:

g = TinkerGraph.open().traversal()
g.addV().property(id,'A').as('a').
           addV().property(id,'B').as('b').
           addV().property(id,'C').as('c').
           addV().property(id,'D').as('d').
           addV().property(id,'E').as('e').
           addV().property(id,'F').as('f').
           addE('next').from('a').to('b').
           addE('next').from('b').to('c').
           addE('next').from('b').to('d').
           addE('next').from('c').to('e').
           addE('next').from('d').to('e').
           addE('next').from('e').to('f').iterate()

The approach uses group(): each time you loop through the repeat(), you essentially form a new grouping of the traversed objects, keyed by the loop count:

gremlin> g.V('A').
......1>   group('m').by(constant(-1)).
......2>   repeat(out().group('m').by(loops())).
......3>   cap('m')
==>[-1:[v[A]],0:[v[B]],1:[v[C],v[D]],2:[v[E],v[E]],3:[v[F],v[F]]]
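
To make the shape of that map easier to follow, here is a minimal plain-Python sketch of the same level-by-level grouping. It is not Gremlin and not part of the original answer: EDGES and group_by_depth are hypothetical names that only mirror the sample graph and the group('m').by(loops()) idea, and duplicates are kept on purpose because the traversal keeps them too.

```python
from collections import defaultdict

# Sample graph from the answer, as (source, target) pairs for the 'next' edges.
EDGES = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"), ("D", "E"), ("E", "F")]


def group_by_depth(edges, start):
    """Group reached vertices by loop depth, keeping duplicates (hypothetical helper)."""
    out = defaultdict(list)
    for src, dst in edges:
        out[src].append(dst)

    groups = {-1: [start]}  # mirrors group('m').by(constant(-1))
    frontier = [start]
    depth = 0               # mirrors loops(), which starts at 0
    while True:
        # One out() step for every traverser in the current frontier.
        frontier = [dst for src in frontier for dst in out[src]]
        if not frontier:
            break
        groups[depth] = frontier
        depth += 1
    return groups


levels = group_by_depth(EDGES, "A")
print(levels)
# {-1: ['A'], 0: ['B'], 1: ['C', 'D'], 2: ['E', 'E'], 3: ['F', 'F']}
```

E and F appear twice because two paths lead to each of them, which matches the v[E],v[E] and v[F],v[F] entries in the console output.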

Grouping by loop depth gives you the structure of the data you want to process. Now you just need to make sure you terminate the repeat() as early as possible:

gremlin> g.V('A').
......1>   group('m').by(constant(-1)).
......2>   until(cap('m').select(values).unfold().count(local).sum().is(gt(3))).
......3>     repeat(out().group('m').by(loops())).
......4>   cap('m')
==>[-1:[v[A]],0:[v[B]],1:[v[C],v[D]]]

In the traversal above, the until() looks at "m" and counts all the vertices collected so far. When that count exceeds our maximum, "3" in this case, we quit. At that point we may or may not have collected more than we needed. In this example we did (four vertices against a limit of three), so the excess has to be thrown away. Technically you need all but the last grouping to satisfy your limit, but unfortunately "all but last" is not easy with Gremlin. I ended up with the approach below, which grabs the last grouping with tail(local) and then uses it as a filter against the result. Note that we get only two results, because including the next level would exceed our limit of "3" results in total:

gremlin> g.V('A').
......1>   group('m').by(constant(-1)).
......2>   until(cap('m').select(values).unfold().count(local).sum().is(gt(3))).
......3>     repeat(out().group('m').by(loops())).
......4>   cap('m').
......5>   select(values).as('v').
......6>   tail(local).as('e').
......7>   select('v').unfold().
......8>   where(P.neq('e')).
......9>   unfold()
==>v[A]
==>v[B]
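
The "count, stop, then drop the last grouping" logic can be hard to read in traversal form, so here is a minimal plain-Python sketch of it. It is an illustration under stated assumptions, not the answer's implementation: EDGES and collect_within_limit are hypothetical names, and the sketch drops the last level only when the limit was actually exceeded. Whether the Gremlin traversal behaves the same way when the graph runs out of vertices before the limit is reached is not verified here.

```python
from collections import defaultdict

# Sample graph from the answer, as (source, target) pairs for the 'next' edges.
EDGES = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"), ("D", "E"), ("E", "F")]


def collect_within_limit(edges, start, limit):
    """Collect whole levels from start, never returning more than limit vertices."""
    out = defaultdict(list)
    for src, dst in edges:
        out[src].append(dst)

    groups = [[start]]  # the seed group, keyed -1 in the traversal
    frontier = [start]
    total = 1
    # until(...) placed before repeat(...) tests the condition before each loop.
    while total <= limit and frontier:
        frontier = [dst for src in frontier for dst in out[src]]
        if frontier:
            groups.append(frontier)
            total += len(frontier)
    if total > limit:
        groups.pop()    # assumption: discard only the level that overshot
    return [v for group in groups for v in group]


print(collect_within_limit(EDGES, "A", 3))
# ['A', 'B']
```

With a limit of 3 the loop stops after the level [C, D] brings the total to 4, and that level is discarded, leaving A and B as in the console output.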

Note that when we bump the limit from "3" to "4" the result changes: traversing to the next level adds 2 more vertices, which brings the total to 4 but does not exceed it.

gremlin> g.V('A').
......1>   group('m').by(constant(-1)).
......2>   until(cap('m').select(values).unfold().count(local).sum().is(gt(4))).
......3>     repeat(out().group('m').by(loops())).
......4>   cap('m').
......5>   select(values).as('v').
......6>   tail(local).as('e').
......7>   select('v').unfold().
......8>   where(P.neq('e')).
......9>   unfold()
==>v[A]
==>v[B]
==>v[C]
==>v[D]

Note that this approach does not take duplicates into account. In the next example, with a limit of "5", v[E] is reached twice and counted twice, so that level is discarded. It's not clear from your use case what's expected here, or whether this will even work for you, but hopefully it provides enough structure for you to at least form the traversal you're looking for:

gremlin> g.V('A').
......1>   group('m').by(constant(-1)).
......2>   until(cap('m').select(values).unfold().count(local).sum().is(gt(5))).
......3>     repeat(out().group('m').by(loops())).
......4>   cap('m').
......5>   select(values).as('v').
......6>   tail(local).as('e').
......7>   select('v').unfold().
......8>   where(P.neq('e')).
......9>   unfold()
==>v[A]
==>v[B]
==>v[C]
==>v[D]
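
If duplicates should count only once, one possible reading is to track the vertices already seen and count each of them a single time. The plain-Python sketch below models that variant. It is an assumption about the desired semantics, not something the answer specifies, and the names EDGES and collect_distinct_within_limit are hypothetical. No Gremlin equivalent is claimed here.

```python
from collections import defaultdict

# Sample graph from the answer, as (source, target) pairs for the 'next' edges.
EDGES = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"), ("D", "E"), ("E", "F")]


def collect_distinct_within_limit(edges, start, limit):
    """Like the level-wise collection, but each vertex is counted only once."""
    out = defaultdict(list)
    for src, dst in edges:
        out[src].append(dst)

    seen = {start}
    groups = [[start]]
    frontier = [start]
    while len(seen) <= limit and frontier:
        next_level = []
        for src in frontier:
            for dst in out[src]:
                if dst not in seen:  # assumption: first visit wins
                    seen.add(dst)
                    next_level.append(dst)
        frontier = next_level
        if frontier:
            groups.append(frontier)
    if len(seen) > limit:
        groups.pop()                 # discard the level that overshot
    return [v for group in groups for v in group]


print(collect_distinct_within_limit(EDGES, "A", 5))
# ['A', 'B', 'C', 'D', 'E']
```

Under this reading a limit of 5 also admits E, because E is counted once instead of twice.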

Thanks to Kelvin Lawrence for suggesting the general approach I've taken in this answer.
