Cypher SORT performance

Problem description

I'm trying to accomplish a pretty common task. I have a substantial dataset in a Neo4j database and I want a RESTful web service to return the data in chunks of 25 nodes. My model is quite simple:

(:Tenant {Hash:''})-[:owns]->(:Asset {Hash:'', Name:''})

I have unique constraints on the Hash properties on both labels.

If I wanted to obtain the 101st data page, my Cypher query would look like this:

MATCH (:Tenant {Hash:'foo'})-[:owns]->(a:Asset)
RETURN a
ORDER BY a.Hash
SKIP 2500
LIMIT 25
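
The SKIP value in the query above follows from the page number and the page size. A minimal sketch of that arithmetic, assuming 1-based page numbers (the helper name `skip_for_page` is hypothetical and not part of the original question):

```python
def skip_for_page(page_number: int, page_size: int = 25) -> int:
    """Return how many rows to SKIP to reach a 1-based page."""
    if page_number < 1:
        raise ValueError("page_number is 1-based")
    # Every page before the requested one contributes page_size skipped rows.
    return (page_number - 1) * page_size


# The 101st page of 25 nodes starts after 100 full pages.
skip = skip_for_page(101)
```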

My dataset consists of a single tenant with ~75K assets. The above query takes ~30(!) seconds to complete. I also notice that the further I advance in the data (i.e. the higher the SKIP), the longer the query takes to return.

I quickly figured out that the culprit of my performance issues is the ORDER BY a.Hash. When I remove it, the query returns in under a second. This is actually quite a surprise, as I'd expect the index itself to be ordered as well.

Obviously, in order to implement sensible pagination, I must have a consistent sort order.

  • Any tips on making this query perform?
  • Alternative suggestions for paging? I can see adding dedicated page nodes, but that will become difficult to maintain.
  • What is the default sort order anyway, and is it consistent?

Solution

Hey @GeoffreyBraaf, I found some time this week to look at your issue. You are right: there were some implementation issues that made this unnecessarily slow.

I used Timmy's suggestion to implement a Java version, which finished in 30 ms. The Cypher version took 100 seconds. Implementing top-n select in Cypher improved that massively, by a factor of 600, so Cypher now takes about 150 ms for that query.
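
To illustrate the general idea behind top-n selection (this is not the Neo4j implementation, only a sketch of the technique): a page at SKIP s and LIMIT n can only contain values among the s + n smallest, so the remaining rows never need to be fully ordered. The example below uses Python's `heapq.nsmallest`; the function names and the generated `asset_hashes` data are hypothetical stand-ins for the ~75K `Asset.Hash` values.

```python
import heapq
from typing import Iterable, List


def page_by_top_n(hashes: Iterable[str], skip: int, limit: int) -> List[str]:
    """Return one sorted page using a bounded top-n selection."""
    # Only the (skip + limit) smallest values can land on or before the page,
    # so select just those; nsmallest returns them in ascending order.
    smallest = heapq.nsmallest(skip + limit, hashes)
    return smallest[skip:]


def page_by_full_sort(hashes: Iterable[str], skip: int, limit: int) -> List[str]:
    """Reference version: sort every value, then slice out the page."""
    return sorted(hashes)[skip:skip + limit]


# Hypothetical data: 75,000 distinct zero-padded keys in shuffled order.
asset_hashes = [f"{(i * 7919) % 75000:05d}" for i in range(75000)]

page = page_by_top_n(asset_hashes, skip=2500, limit=25)
```

Both functions return the same page; the top-n version simply avoids ordering the rows that can never appear in the result.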

See: https://gist.github.com/jexp/9954509

The work is already merged into 2.0-maint and will be released as part of 2.0.2.

See: https://github.com/neo4j/neo4j/pull/2230
