Getting top n records for each group in neo4j

Question

I need to group the data from a neo4j database and then filter out everything except the top n records of every group.

Example:

I have two node types: Order and Article. Between them there is an "ADDED" relationship, which has a timestamp property. What I want to know, for every article, is how many times it was among the first two articles added to an order. I tried the following approach:

  1. Get all the Order-[ADDED]-Article paths.
  2. Sort the result of step 1 by order id as the first sorting key and by the timestamp of the ADDED relationship as the second sorting key.
  3. For every subgroup from step 2 representing one order, keep only the top 2 rows.
  4. Count distinct article ids in the output of step 3.

My problem is that I got stuck at step 3. Is it possible to get the top 2 rows for every subgroup representing an order?

Thanks,

Tiberiu

Recommended answer

Try:

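// Match every ADDED relationship between an Order and an Article
MATCH (o:Order)-[r:ADDED]->(a:Article)
WITH o, r, a
ORDER BY o.oid, r.t
// Collect each order's articles in sorted order and keep only the first two
WITH o, COLLECT(a)[..2] AS topArticlesByOrder
UNWIND topArticlesByOrder AS a
// Count how often each article appears among the first two of an order
RETURN a.aid AS articleId, COUNT(*) AS count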

The result looks like this:

articleId    count
   8           6
   2           2
   4           5
   7           2
   3           3
   6           5
   0           7

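The example graph was created with the following statement (article ids and timestamps are random, so the counts will vary from run to run):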

// On Neo4j 4.0 and later, use toInteger() in place of TOINT()
FOREACH(opar IN RANGE(1,15) |
    MERGE (o:Order {oid:opar})
    FOREACH(apar IN RANGE(1,5) |
        MERGE (a:Article {aid:TOINT(RAND()*10)})
        CREATE (o)-[:ADDED {t:timestamp() - TOINT(RAND()*1000)}]->(a)
    )
)
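
To check the logic of the query outside Neo4j, the four steps from the question can be sketched in plain Python. This is only an illustration, not Neo4j driver code: the function name `count_top_n_articles`, the `(order_id, timestamp, article_id)` tuple layout, and the sample rows are all assumptions made up for the example.

```python
from collections import Counter
from itertools import groupby, islice
from operator import itemgetter


def count_top_n_articles(rows, n=2):
    """Count how often each article is among the first n added to an order.

    rows: iterable of (order_id, timestamp, article_id) tuples, one per
    ADDED relationship (a hypothetical in-memory stand-in for the graph).
    """
    # Step 2: sort by order id, then by timestamp of the ADDED relationship.
    ordered = sorted(rows, key=itemgetter(0, 1))
    counts = Counter()
    # Step 3: groupby yields one run of rows per order; keep only the first n.
    for _order_id, group in groupby(ordered, key=itemgetter(0)):
        for _oid, _t, article_id in islice(group, n):
            # Step 4: tally the article ids that survived the cut.
            counts[article_id] += 1
    return counts


# Made-up sample data: three orders with out-of-order timestamps.
rows = [
    (1, 30, "a"), (1, 10, "b"), (1, 20, "c"),
    (2, 5, "b"), (2, 7, "a"), (2, 9, "d"),
    (3, 1, "c"),
]
result = count_top_n_articles(rows)
print(dict(result))
```

Here `islice(group, n)` plays the same role as the `COLLECT(a)[..2]` slice in the Cypher query: both rely on the rows already being sorted within each order before the first n are taken.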
