Neo4j Cypher query performance optimization
Question
I have the following Neo4j Cypher query:
MATCH (dg:DecisionGroup)-[:CONTAINS]->(childD:Decision)
WHERE dg.id = 1
MATCH (childD)-[relationshipValueRel4:HAS_VALUE_ON]-(filterCharacteristic4:Characteristic)
WHERE filterCharacteristic4.id = 4
WITH relationshipValueRel4, childD, dg
WHERE (ANY (id IN [2,3]
WHERE id IN relationshipValueRel4.optionIds ))
WITH childD, dg
OPTIONAL MATCH (childD)-[vg:HAS_VOTE_ON]->(c:Criterion)
WHERE c.id IN [2, 3]
WITH childD, dg, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes
WITH childD , dg , toFloat(sum(weight)) as weight, toInt(sum(totalVotes)) as totalVotes
ORDER BY weight DESC
SKIP 0 LIMIT 10
WITH * MATCH (childD)-[ru:CREATED_BY]->(u:User) OPTIONAL MATCH (childD)-[rup:UPDATED_BY]->(up:User)
RETURN ru, u, rup, up, childD AS decision, weight, totalVotes,
[ (dg)<-[:DEFINED_BY]-(entity)<-[:COMMENTED_ON]-(comg:CommentGroup)-[:COMMENTED_FOR]->(childD) | {entityId: toInt(entity.id), types: labels(entity), totalComments: toInt(comg.totalComments)} ] AS commentGroups,
[ (dg)<-[:DEFINED_BY]-(c1)<-[vg1:HAS_VOTE_ON]-(childD) | {criterionId: toInt(c1.id), weight: vg1.avgVotesWeight, totalVotes: toInt(vg1.totalVotes)} ] AS weightedCriteria, [ (dg)<-[:DEFINED_BY]-(ch1:Characteristic)<-[v1:HAS_VALUE_ON]-(childD) WHERE NOT ((ch1)<-[:DEPENDS_ON]-()) | {characteristicId: toInt(ch1.id), optionIds: v1.optionIds, valueIds: v1.valueIds, value: v1.value, available: v1.available, totalHistoryValues: v1.totalHistoryValues, totalFlags: v1.totalFlags, description: v1.description, valueType: ch1.valueType, visualMode: ch1.visualMode} ] AS valuedCharacteristics
I'm not satisfied with the performance of this query.
This is the PROFILE output:
Cypher version: CYPHER 3.3, planner: COST, runtime: INTERPRETED. 3296130 total db hits in 2936 ms
Is there any way to optimize the performance of this query?
Recommended answer
It is a little hard to optimize this query without a dataset, knowledge of your graph, and an idea of what you are trying to search for.
Performance depends on:
- the query itself
- the schema (indexes and constraints)
- the graph model
- the Neo4j configuration
- the hardware
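
For the schema item, a minimal sketch of what checking the indexes might look like, assuming the id lookups in the question (DecisionGroup, Characteristic, Criterion) are not already backed by an index or a unique constraint. Whether they are is not stated in the question, so treat these statements as something to verify against the existing schema and not as a known fix. The syntax shown is the Neo4j 3.x form; later versions use a different CREATE INDEX syntax.

```cypher
// Assumption: no index or unique constraint exists yet on these id properties.
// Each statement is run on its own (Neo4j 3.x syntax).
CREATE INDEX ON :DecisionGroup(id);
CREATE INDEX ON :Characteristic(id);
CREATE INDEX ON :Criterion(id);
```

If the ids are meant to be unique, a unique constraint would serve the same lookup purpose and also enforce uniqueness.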
There is no big problem with your query, although to my eye it could be written in a more readable form (for example: one big MATCH, the inline property shorthand in the MATCH in place of the WHERE clause, replacing the ANY with an OR, ...), but that will not change the query plan.
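
As an illustration of those readability suggestions only, the filtering part of the query could be sketched as follows. The variable name rel is shortened for illustration, and the final RETURN is a placeholder standing in for the rest of the original query. As noted above, this is not expected to change the query plan.

```cypher
// One MATCH with inline property maps instead of separate WHERE clauses
MATCH (dg:DecisionGroup {id: 1})-[:CONTAINS]->(childD:Decision)-[rel:HAS_VALUE_ON]-(:Characteristic {id: 4})
// ANY(id IN [2,3] WHERE id IN rel.optionIds) written as an explicit OR
WHERE 2 IN rel.optionIds OR 3 IN rel.optionIds
// Placeholder: the rest of the original query would continue from here
RETURN childD, dg
```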
Be sure to use query parameters with this query, so that the plan for this long query is not recomputed every time it runs.
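
A minimal sketch of the parameterized form, limited to the filtering part of the query. The parameter names ($decisionGroupId, $characteristicId, $optionIds) are hypothetical; the values would be supplied by the driver or client alongside the query text, so that the text itself stays identical between calls.

```cypher
// Hypothetical parameters: $decisionGroupId, $characteristicId, $optionIds
MATCH (dg:DecisionGroup)-[:CONTAINS]->(childD:Decision)
WHERE dg.id = $decisionGroupId
MATCH (childD)-[rel:HAS_VALUE_ON]-(ch:Characteristic)
WHERE ch.id = $characteristicId
  AND ANY (id IN $optionIds WHERE id IN rel.optionIds)
// Placeholder: the rest of the original query would continue from here
RETURN childD, dg
```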
Your query spends most of its time in (childD)-[relationshipValueRel4:HAS_VALUE_ON]-(:Characteristic) plus the WHERE clause on it (i.e. 1.5M * 2 db hits).
So one solution could be to change the model by creating relationships such as HAS_VALUE_ON_WITH_OPTID_1, HAS_VALUE_ON_WITH_OPTID_2, ...
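
A rough sketch of what that model change might look like, under several assumptions not confirmed by the question: that the new relationships run from the Decision to the Characteristic in the same direction as HAS_VALUE_ON, that one relationship type is created per option id, and that they are derived from the existing optionIds property. Only option ids 2 and 3 are shown.

```cypher
// Assumed one-off migration, one statement per option id of interest
MATCH (d:Decision)-[v:HAS_VALUE_ON]->(ch:Characteristic)
WHERE 2 IN v.optionIds
MERGE (d)-[:HAS_VALUE_ON_WITH_OPTID_2]->(ch);

MATCH (d:Decision)-[v:HAS_VALUE_ON]->(ch:Characteristic)
WHERE 3 IN v.optionIds
MERGE (d)-[:HAS_VALUE_ON_WITH_OPTID_3]->(ch);

// The filter then becomes a match on relationship type,
// with no per-relationship check of the optionIds list
MATCH (dg:DecisionGroup {id: 1})-[:CONTAINS]->(childD:Decision)
MATCH (childD)-[:HAS_VALUE_ON_WITH_OPTID_2|HAS_VALUE_ON_WITH_OPTID_3]->(:Characteristic {id: 4})
// DISTINCT because a decision holding both options would match twice
RETURN DISTINCT childD
```

The trade-off is that these extra relationships have to be kept in sync whenever optionIds changes, so it is worth re-running PROFILE to confirm the db hits actually drop before committing to the new model.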