Why is allShortestPaths so slow?


Question

I created a graph database with Python and the neo4j library. The graph has 50k nodes and 100k relationships.

How I create nodes:

CREATE (user:user {task_id: %s, id: %s, root: 1, private: 0})

How I create relationships:

MATCH (root_user), (friend_user)
WHERE root_user.id = %s
  AND root_user.task_id = %s
  AND friend_user.id = %s
  AND friend_user.task_id = %s
CREATE (root_user)-[r:FRIEND_OF]->(friend_user)
RETURN root_user, friend_user

How I search for all paths between two nodes:

MATCH (start_user:user {id: %s, task_id: %s}), 
      (end_user:user {id: %s, task_id: %s}), 
      path = allShortestPaths((start_user)-[*..3]-(end_user)) RETURN path

It is very slow: around 30-60 minutes on the 50k-node graph, and I can't understand why. I tried creating an index like this:

CREATE INDEX ON :user(id, task_id)

But it did not help. Can you help me? Thanks.

Recommended answer

You should never generate a long Cypher query that contains N slight variations of essentially the same Cypher code. That is very slow and takes up a lot of memory.

Instead, you should pass parameters to a much simpler Cypher query.

For example, when creating your nodes, you could pass a data parameter to the following Cypher code:

UNWIND $data AS d
CREATE (user:user {task_id: d.taskId, id: d.id, root: 1, private: 0})

The data parameter value that you pass would be a list of maps, each containing a taskId and an id. The UNWIND clause "unwinds" the data list into individual d maps, so a single query creates all the nodes. This would be much faster.
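As a rough illustration of passing the data parameter from Python, here is a minimal sketch. It assumes `session` is an open session from the official neo4j Python driver (anything with a `run(query, parameters)` method works). The helper names `build_user_rows` and `create_users`, and the `batch_size` default, are hypothetical and not part of the original answer.

```python
from typing import Any, Dict, List

# Same Cypher as above; $data is supplied as a query parameter.
CREATE_USERS = """
UNWIND $data AS d
CREATE (user:user {task_id: d.taskId, id: d.id, root: 1, private: 0})
"""


def build_user_rows(task_id: int, user_ids: List[int]) -> List[Dict[str, Any]]:
    """Build the list of maps that the query receives as $data."""
    return [{"taskId": task_id, "id": user_id} for user_id in user_ids]


def create_users(session: Any, rows: List[Dict[str, Any]],
                 batch_size: int = 10000) -> int:
    """Send the rows in batches; returns how many queries were issued.

    The batch size is an arbitrary assumption; tune it for your data.
    """
    queries_sent = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        session.run(CREATE_USERS, {"data": batch})
        queries_sent += 1
    return queries_sent
```

With 50k nodes and this batch size, the whole import would take five queries instead of one query (or one long generated string) per node.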

Something similar needs to be done with your relationship-creation code.
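One possible shape for the relationship-creation code is sketched below. The map keys (`rootId`, `rootTaskId`, `friendId`, `friendTaskId`) and the helper names are hypothetical, and `session` is again assumed to be a neo4j driver session. The node patterns carry the :user label so that an index on :user can be used for the lookups.

```python
from typing import Any, Dict, List, Tuple

# One parameterized query for all relationships; each d is one map from $data.
CREATE_FRIENDSHIPS = """
UNWIND $data AS d
MATCH (root_user:user {id: d.rootId, task_id: d.rootTaskId})
MATCH (friend_user:user {id: d.friendId, task_id: d.friendTaskId})
CREATE (root_user)-[:FRIEND_OF]->(friend_user)
"""


def build_friend_rows(task_id: int,
                      pairs: List[Tuple[int, int]]) -> List[Dict[str, Any]]:
    """Turn (root_id, friend_id) pairs into the maps the query expects.

    Assumes both users of a pair belong to the same task.
    """
    return [
        {"rootId": root_id, "rootTaskId": task_id,
         "friendId": friend_id, "friendTaskId": task_id}
        for root_id, friend_id in pairs
    ]


def create_friendships(session: Any, rows: List[Dict[str, Any]]) -> None:
    """Create all FRIEND_OF relationships with a single query."""
    session.run(CREATE_FRIENDSHIPS, {"data": rows})
```

For 100k relationships it may be worth splitting `rows` into batches, as with the nodes, rather than sending them all at once.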

In addition, in order to use any of your :user indexes, your MATCH clause MUST specify the :user label in the relevant node patterns. Otherwise, you are asking Cypher to scan all nodes, regardless of label, and that kind of processing cannot take advantage of indexes. For example, the relationship-creation query should start with:

MATCH (root_user:user), (friend_user:user)
...
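The same advice about parameters can be applied to the path search from the question. The sketch below keeps the question's query but replaces the %s placeholders with named parameters; the parameter names and the `find_paths` helper are hypothetical, and `session` is assumed to be a neo4j driver session. It illustrates the calling pattern only and is not a measured fix for the 30-60 minute runtime.

```python
from typing import Any, List

# The question's search query, with labels kept and %s replaced by parameters.
FIND_PATHS = """
MATCH (start_user:user {id: $start_id, task_id: $start_task_id}),
      (end_user:user {id: $end_id, task_id: $end_task_id}),
      path = allShortestPaths((start_user)-[*..3]-(end_user))
RETURN path
"""


def find_paths(session: Any, start_id: int, start_task_id: int,
               end_id: int, end_task_id: int) -> List[Any]:
    """Run the search and collect the value of the `path` column."""
    result = session.run(FIND_PATHS, {
        "start_id": start_id,
        "start_task_id": start_task_id,
        "end_id": end_id,
        "end_task_id": end_task_id,
    })
    return [record["path"] for record in result]
```

Because the query text stays identical between calls, the database can reuse one execution plan instead of compiling a new query string for every pair of users.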
