Getting top n records for each group in Neo4j
Question
I need to group the data from a Neo4j database and then filter out everything except the top n records of every group.
Example:
I have two node types: Order and Article. Between them there is an "ADDED" relationship, which has a timestamp property. What I want to know (for every article) is how many times it was among the first two articles added to an order. What I tried is the following approach:
1. Get all the Order-[ADDED]-Article matches.
2. Sort the result from step 1 by order id as the first sorting key and then by the timestamp of the ADDED relationship as the second sorting key.
3. For every subgroup from step 2 representing one order, keep only the top 2 rows.
4. Count distinct article ids in the output of step 3.
My problem is that I got stuck at step 3. Is it possible to get the top 2 rows for every subgroup representing an order?
Thanks,
Tiberiu
Recommended answer
Try
MATCH (o:Order)-[r:ADDED]->(a:Article)
WITH o, r, a
ORDER BY o.oid, r.t
WITH o, COLLECT(a)[..2] AS topArticlesByOrder
UNWIND topArticlesByOrder AS a
RETURN a.aid AS articleId, COUNT(*) AS count
The result looks like
articleId count
8 6
2 2
4 5
7 2
3 3
6 5
0 7
for this sample graph, which was created with
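// Build 15 orders, each with 5 ADDED relationships to randomly chosen articles.
FOREACH (opar IN range(1, 15) |
  MERGE (o:Order {oid: opar})
  FOREACH (apar IN range(1, 5) |
    // Article ids are random integers in 0..9; MERGE reuses an existing article.
    MERGE (a:Article {aid: toInteger(rand() * 10)})
    // toInteger replaces the older TOINT; node patterns need parentheses.
    CREATE (o)-[:ADDED {t: timestamp() - toInteger(rand() * 1000)}]->(a)
  )
)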
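The answer hard-codes the group size as 2. As a minimal sketch, the same query can take the group size as a query parameter so that it covers the general "top n per group" case from the question. The parameter name `$n` is an assumption introduced here for illustration, and the `$` parameter syntax assumes Neo4j 3.0 or later. The approach relies on COLLECT keeping the row order produced by the preceding ORDER BY, which is the usual idiom for this pattern.

```cypher
// Sketch: top-n articles per order, with n supplied as a parameter.
// $n is a hypothetical parameter name; in Neo4j Browser it could be
// set beforehand with `:param n => 2`.
MATCH (o:Order)-[r:ADDED]->(a:Article)
// Sort first so that each order's articles reach COLLECT oldest-first.
WITH o, r, a
ORDER BY o.oid, r.t
// Group by order and keep only the first $n articles of each group.
WITH o, COLLECT(a)[..$n] AS topArticlesByOrder
// Flatten back to one row per (order, article) pair.
UNWIND topArticlesByOrder AS a
// Count how many orders had each article among their first $n.
RETURN a.aid AS articleId, COUNT(*) AS count
```

With `$n` set to 2 this should behave like the query in the answer. An order with fewer than `$n` articles simply contributes all of its articles, because slicing past the end of a list returns the whole list.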