带有子图聚合的递归查询(任意深度) [英] Recursive query with sub-graph aggregation (arbitrary depth)

查看:65
本文介绍了带有子图聚合的递归查询(任意深度)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

前面的问题问过一个关于沿图形聚合数量的问题.提供的两个答案很好用,但是现在我试图将Cypher查询扩展到可变深度图.

I asked a question earlier about aggregating quantities along a graph. The two answers provided worked well, but now I am trying to extend the Cypher query it to a graph of variable depth.

总而言之,我们从一堆叶子存储开始,所有这些叶子存储都与特定的供应商相关联,这是Store节点上的一个属性.然后将库存转移到其他商店,每个供应商所占的比例与其对原始商店的贡献相对应.

To summarize we start of with a bunch of leaf stores which all are associated with a particular supplier, which is a property on the Store node. Inventory is then moved along to other stores and the proportion from each supplier corresponds to their contribution to the original store.

因此,对于节点B02S2贡献了750/1250 = 60%,而S3贡献了40%.然后,我们将B02的600个单位移动,其中60%属于S2,而40%属于S3,依此类推.

So for node B02, S2 contributed 750/1250 = 60% and S3 contributed 40%. We then move 600 units our of B02 of which 60% belongs to S2 and 40% to S3 and so on.

我们想知道最终进入D01的700个单位中每个供应商所占的百分比.其中具有相同名称的供应商是相同的供应商.因此,对于以上图表,我们期望:

What we want to know what percentage of the final 700 units into D01 belong to each supplier. Where suppliers with the same name are the same supplier. So for the above graph we expect:

S1,38.09
S2,27.61
S3,34.28

S1, 38.09
S2, 27.61
S3, 34.28

我已经使用此Cypher脚本准备了一个图形:

I've prepared a graph using this Cypher script:

CREATE (A01:Store {Name: 'A01', Supplier: 'S1'})
CREATE (A02:Store {Name: 'A02', Supplier: 'S1'})
CREATE (A03:Store {Name: 'A03', Supplier: 'S2'})
CREATE (A04:Store {Name: 'A04', Supplier: 'S3'})
CREATE (A05:Store {Name: 'A05', Supplier: 'S1'})
CREATE (A06:Store {Name: 'A06', Supplier: 'S1'})
CREATE (A07:Store {Name: 'A07', Supplier: 'S2'})
CREATE (A08:Store {Name: 'A08', Supplier: 'S3'})

CREATE (B01:Store {Name: 'B01'})
CREATE (B02:Store {Name: 'B02'})
CREATE (B03:Store {Name: 'B03'})
CREATE (B04:Store {Name: 'B04'})

CREATE (C01:Store {Name: 'C01'})
CREATE (C02:Store {Name: 'C02'})

CREATE (D01:Store {Name: 'D01'})

CREATE (A01)-[:MOVE_TO {Quantity: 750}]->(B01)
CREATE (A02)-[:MOVE_TO {Quantity: 500}]->(B01)
CREATE (A03)-[:MOVE_TO {Quantity: 750}]->(B02)
CREATE (A04)-[:MOVE_TO {Quantity: 500}]->(B02)
CREATE (A05)-[:MOVE_TO {Quantity: 100}]->(B03)
CREATE (A06)-[:MOVE_TO {Quantity: 200}]->(B03)
CREATE (A07)-[:MOVE_TO {Quantity: 50}]->(B04)
CREATE (A08)-[:MOVE_TO {Quantity: 450}]->(B04)

CREATE (B01)-[:MOVE_TO {Quantity: 400}]->(C01)
CREATE (B02)-[:MOVE_TO {Quantity: 600}]->(C01)
CREATE (B03)-[:MOVE_TO {Quantity: 100}]->(C02)
CREATE (B04)-[:MOVE_TO {Quantity: 200}]->(C02)

CREATE (C01)-[:MOVE_TO {Quantity: 500}]->(D01)
CREATE (C02)-[:MOVE_TO {Quantity: 200}]->(D01)

当前查询是这样:

MATCH (s:Store { Name:'D01' })
MATCH (s)<-[t:MOVE_TO]-()<-[r:MOVE_TO]-(supp)
WITH t.Quantity as total, collect(r) as movements
WITH total, movements, reduce(totalSupplier = 0, r IN movements | totalSupplier + r.Quantity) as supCount
UNWIND movements as movement
RETURN startNode(movement).Supplier as Supplier, round(100.0*movement.Quantity/supCount) as pct

我正在尝试使用递归关系,类似于以下内容:

I am trying to use recursive relationships, something along the lines of this:

MATCH (s)<-[t:MOVE_TO]-()<-[r:MOVE_TO*]-(supp)

但是,它为最终节点提供了多条路径,我需要汇总我认为每个节点上的库存.

however that gives multiple paths to the end node and I need to aggregate the inventory at each node I think.

推荐答案

此查询为符合问题描述的模型的任意图形生成正确的结果. (当Store x将商品移动到Store y时,假定Supplier所移动商品的百分比与Store x的百分比相同.)

This query generates the correct results for any arbitrary graph that conforms to the model described in the question. (When Store x moves merchandise to Store y, it is assumed that the Supplier percentages of the moved merchandise is the same as for Store x.)

但是,此解决方案并不只包含单个Cypher查询(因为这不可能).取而代之的是,它涉及多个查询,必须迭代其中一个查询,直到计算遍历Store节点的整个图为止.该迭代查询将清楚地告诉您何时停止迭代.需要其他Cypher查询来:准备进行迭代的图形,报告结束"节点的供应商百分比以及清理图形(以便将其还原为下面的步骤1之前的状态)

However, this solution does not consist of just a single Cypher query (since that may not be possible). Instead, it involves multiple queries, one of which must be iterated until the calculations cascade through the entire graph of Store nodes. That iterated query will clearly tell you when to stop iterating. The other Cypher queries are needed to: prepare the graph for iteration, report the Supplier percentages for the "end" node(s), and clean up the graph (so that it is restored to the way it was before step 1, below).

这些查询可能会进一步优化.

These queries could probably be further optimized.

以下是必需的步骤:

  1. 为迭代查询准备图形(为所有开始的Store节点初始化临时pcts数组).这包括创建具有包含所有供应商名称的数组的单例Suppliers节点.这用于建立临时pcts数组元素的顺序,并将这些元素映射回正确的供应商名称.

  1. Prepare the graph for the iterative query (initializes the temporary pcts array for all starting Store nodes). This includes the creation of a singleton Suppliers node that has an array with all the supplier names. This is used to establish the order of the elements of the temporary pcts arrays, and to map those elements back to the correct supplier name.

MATCH (store:Store)
WHERE HAS (store.Supplier)
WITH COLLECT(store) AS stores, COLLECT(DISTINCT store.Supplier) AS csup
CREATE (sups:Suppliers { names: csup })
WITH stores, sups
UNWIND stores AS store
SET store.pcts =
  EXTRACT(i IN RANGE(0,LENGTH(sups.names)-1,1) |
    CASE WHEN store.Supplier = sups.names[i] THEN 1.0 ELSE 0.0 END)
RETURN store.Name, store.Supplier, store.pcts;

以下是包含问题数据的结果:

Here is the result with the question's data:

+---------------------------------------------+
| store.Name | store.Supplier | store.pcts    |
+---------------------------------------------+
| "A01"      | "S1"           | [1.0,0.0,0.0] |
| "A02"      | "S1"           | [1.0,0.0,0.0] |
| "A03"      | "S2"           | [0.0,1.0,0.0] |
| "A04"      | "S3"           | [0.0,0.0,1.0] |
| "A05"      | "S1"           | [1.0,0.0,0.0] |
| "A06"      | "S1"           | [1.0,0.0,0.0] |
| "A07"      | "S2"           | [0.0,1.0,0.0] |
| "A08"      | "S3"           | [0.0,0.0,1.0] |
+---------------------------------------------+
8 rows
83 ms
Nodes created: 1
Properties set: 9

  • 迭代查询(重复运行,直到返回0行)

  • Iterative query (run repeatedly until 0 rows are returned)

    MATCH p=(s1:Store)-[m:MOVE_TO]->(s2:Store)
    WHERE HAS(s1.pcts) AND NOT HAS(s2.pcts)
    SET s2.pcts = EXTRACT(i IN RANGE(1,LENGTH(s1.pcts),1) | 0)
    WITH s2, COLLECT(p) AS ps
    WITH s2, ps, REDUCE(s=0, p IN ps | s + HEAD(RELATIONSHIPS(p)).Quantity) AS total
    FOREACH(p IN ps |
      SET HEAD(RELATIONSHIPS(p)).pcts = EXTRACT(parentPct IN HEAD(NODES(p)).pcts | parentPct * HEAD(RELATIONSHIPS(p)).Quantity / total)
    )
    FOREACH(p IN ps |
      SET s2.pcts = EXTRACT(i IN RANGE(0,LENGTH(s2.pcts)-1,1) | s2.pcts[i] + HEAD(RELATIONSHIPS(p)).pcts[i])
    )
    RETURN s2.Name, s2.pcts, total, EXTRACT(p IN ps | HEAD(RELATIONSHIPS(p)).pcts) AS rel_pcts;
    

    迭代1结果:

    +-----------------------------------------------------------------------------------------------+
    | s2.Name | s2.pcts       | total | rel_pcts                                                    |
    +-----------------------------------------------------------------------------------------------+
    | "B04"   | [0.0,0.1,0.9] | 500   | [[0.0,0.1,0.0],[0.0,0.0,0.9]]                               |
    | "B01"   | [1.0,0.0,0.0] | 1250  | [[0.6,0.0,0.0],[0.4,0.0,0.0]]                               |
    | "B03"   | [1.0,0.0,0.0] | 300   | [[0.3333333333333333,0.0,0.0],[0.6666666666666666,0.0,0.0]] |
    | "B02"   | [0.0,0.6,0.4] | 1250  | [[0.0,0.6,0.0],[0.0,0.0,0.4]]                               |
    +-----------------------------------------------------------------------------------------------+
    4 rows
    288 ms
    Properties set: 24
    

    迭代2结果:

    +-------------------------------------------------------------------------------------------------------------------------------+
    | s2.Name | s2.pcts                                      | total | rel_pcts                                                     |
    +-------------------------------------------------------------------------------------------------------------------------------+
    | "C02"   | [0.3333333333333333,0.06666666666666667,0.6] | 300   | [[0.3333333333333333,0.0,0.0],[0.0,0.06666666666666667,0.6]] |
    | "C01"   | [0.4,0.36,0.24]                              | 1000  | [[0.4,0.0,0.0],[0.0,0.36,0.24]]                              |
    +-------------------------------------------------------------------------------------------------------------------------------+
    2 rows
    193 ms
    Properties set: 12
    

    迭代3的结果:

    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | s2.Name | s2.pcts                                                       | total | rel_pcts                                                                                                                    |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | "D01"   | [0.38095238095238093,0.27619047619047615,0.34285714285714286] | 700   | [[0.2857142857142857,0.2571428571428571,0.17142857142857143],[0.09523809523809522,0.01904761904761905,0.17142857142857143]] |
    +---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row
    40 ms
    Properties set: 6
    

    迭代4结果:

    +--------------------------------------+
    | s2.Name | s2.pcts | total | rel_pcts |
    +--------------------------------------+
    +--------------------------------------+
    0 rows
    69 ms
    

  • 列出结束的Store节点的非零Supplier百分比.

  • List the non-zero Supplier percentages for the ending Store node(s).

    MATCH (store:Store), (sups:Suppliers)
    WHERE NOT (store:Store)-[:MOVE_TO]->(:Store) AND HAS(store.pcts)
    RETURN store.Name, [i IN RANGE(0,LENGTH(sups.names)-1,1) WHERE store.pcts[i] > 0 | {supplier: sups.names[i], pct: store.pcts[i] * 100}] AS pcts;
    

    结果:

    +----------------------------------------------------------------------------------------------------------------------------------+
    | store.Name | pcts                                                                                                                |
    +----------------------------------------------------------------------------------------------------------------------------------+
    | "D01"      | [{supplier=S1, pct=38.095238095238095},{supplier=S2, pct=27.619047619047617},{supplier=S3, pct=34.285714285714285}] |
    +----------------------------------------------------------------------------------------------------------------------------------+
    1 row
    293 ms
    

  • 清理(删除所有临时pcts道具和Suppliers节点).

  • Clean up (remove all the temporary pcts props and the Suppliers node).

    MATCH (s:Store), (sups:Suppliers)
    OPTIONAL MATCH (s)-[m:MOVE_TO]-()
    REMOVE m.pcts, s.pcts
    DELETE sups;
    

    结果:

    0 rows
    203 ms
    +-------------------+
    | No data returned. |
    +-------------------+
    Properties set: 29
    Nodes deleted: 1
    

  • 这篇关于带有子图聚合的递归查询(任意深度)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

    查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆