Optimize Neo4j Cypher path finding with limited paths in an undirected graph

Question

This is a follow-up to the question "Neo4j Cypher path finding slow in undirected graph". Michael Hunger and Wes Freeman kindly helped, but I failed to adapt the techniques I learned to path-finding queries that should return the paths.

The problem:

The query below takes roughly 3 s and returns 13 rows (the paths found) from the database. I find that slow and would like it to execute faster, but I don't know how to optimize it. (This is just an example, of course; I find other, similar queries slow too.)

START n=node:NodeIds('id:4000'), t=node:NodeIds('id:10778')   
MATCH path = (n)-[:ASSOCIATIVY_CONNECTION*1..3]-(t)   
RETURN nodes(path) AS Nodes

The same query with profile data:

neo4j-sh (0)$ profile START n=node:NodeIds('id:4000'), t=node:NodeIds('id:10778')    MATCH path = (n)-[:ASSOCIATIVY_CONNECTION*1..3]-(t)    RETURN nodes(path) AS Nodes;
==> +-------------------------------------------------------------------------------------------+
==> | Nodes                                                                                     |
==> +-------------------------------------------------------------------------------------------+
==> | [Node[3984]{Id:4000},Node[986]{Id:1001},Node[18536]{Id:18552},Node[10763]{Id:10778}]      |
==> | [Node[3984]{Id:4000},Node[1085]{Id:1100},Node[9955]{Id:9970},Node[10763]{Id:10778}]       |
==> | [Node[3984]{Id:4000},Node[133348]{Id:133364},Node[9955]{Id:9970},Node[10763]{Id:10778}]   |
==> | [Node[3984]{Id:4000},Node[111409]{Id:111425},Node[18536]{Id:18552},Node[10763]{Id:10778}] |
==> | [Node[3984]{Id:4000},Node[64464]{Id:64480},Node[18536]{Id:18552},Node[10763]{Id:10778}]   |
==> | [Node[3984]{Id:4000},Node[64464]{Id:64480},Node[9955]{Id:9970},Node[10763]{Id:10778}]     |
==> | [Node[3984]{Id:4000},Node[64464]{Id:64480},Node[10763]{Id:10778}]                         |
==> | [Node[3984]{Id:4000},Node[64464]{Id:64480},Node[64455]{Id:64471},Node[10763]{Id:10778}]   |
==> | [Node[3984]{Id:4000},Node[79152]{Id:79168},Node[18536]{Id:18552},Node[10763]{Id:10778}]   |
==> | [Node[3984]{Id:4000},Node[69190]{Id:69206},Node[18536]{Id:18552},Node[10763]{Id:10778}]   |
==> | [Node[3984]{Id:4000},Node[25893]{Id:25909},Node[18536]{Id:18552},Node[10763]{Id:10778}]   |
==> | [Node[3984]{Id:4000},Node[31683]{Id:31699},Node[18536]{Id:18552},Node[10763]{Id:10778}]   |
==> | [Node[3984]{Id:4000},Node[6965]{Id:6980},Node[18536]{Id:18552},Node[10763]{Id:10778}]     |
==> +-------------------------------------------------------------------------------------------+
==> 13 rows
==> 2824 ms
==> 
==> ColumnFilter(symKeys=["path", "n", "t", "  UNNAMED3", "Nodes"], returnItemNames=["Nodes"], _rows=13, _db_hits=0)
==> Extract(symKeys=["n", "t", "  UNNAMED3", "path"], exprKeys=["Nodes"], _rows=13, _db_hits=0)
==>   ExtractPath(name="path", patterns=["  UNNAMED3=n-[:ASSOCIATIVY_CONNECTION*1..3]-t"], _rows=13, _db_hits=0)
==>     PatternMatch(g="(n)-['  UNNAMED3']-(t)", _rows=13, _db_hits=0)
==>       Nodes(name="t", _rows=1, _db_hits=1)
==>         Nodes(name="n", _rows=1, _db_hits=1)
==>           ParameterPipe(_rows=1, _db_hits=0) 

The setup:

The Neo4j graph database has 165k nodes and 266k relationships where all the relationships are undirected (bidirectional) and have the label "ASSOCIATIVY_CONNECTION". None of the nodes are connected to the root node. Apart from the nodes and relationships only one integer value is stored for each node (the graph database is not used to store the actual data, but just for the structure).

The memory configuration for this database is as follows:

wrapper.java.initmemory=1024
wrapper.java.maxmemory=1024

neostore.nodestore.db.mapped_memory=225M
neostore.relationshipstore.db.mapped_memory=250M
neostore.propertystore.db.mapped_memory=290M
neostore.propertystore.db.strings.mapped_memory=330M
neostore.propertystore.db.arrays.mapped_memory=330M

The dataset is a graph generated by following interconnections between Wikipedia articles and is downloadable from here.

I run Neo4j 1.9.M05 community on a Windows 8 machine by starting from Neo4j.bat. I don't think hardware can be an issue as the query only causes a short 10% CPU spike. There are GBs of free RAM available.

I'd be very thankful for pointers on how to make this query run faster.

Edit: I tried the same query on a slightly enhanced version of the same graph with 283k nodes and 538k relationships. It now takes 20 seconds!

Edit 2, increasing memory limits: As advised by Michael, I raised the wrapper.java.initmemory and wrapper.java.maxmemory settings to 8192 (8 GB). This increased the memory footprint of the Java process running Neo4j to 2.25 GB and also improved the query's performance: it now takes about 1 s once warmed up (after the third run). I also raised the memory settings in the neo4j.properties config file to 2 GB each, but that had no noticeable effect. For all this to work I needed the 64-bit Java runtime (the default one you can easily download for your browser is a 32-bit version), so I downloaded the manual installer for it. Once it is installed, Neo4j automatically starts with it instead of the 32-bit version.

Recommended answer

Since you are running on Windows, please increase your heap size: on Windows, MMIO (memory-mapped I/O) direct memory is part of the Java heap.
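
To make the sizing concrete: the mapped-memory settings listed in the question add up to 225M + 250M + 290M + 330M + 330M = 1425M, which is more than the 1024 MB heap configured there. The fragment below is only a sketch of a configuration that follows this advice. It assumes the usual Neo4j 1.9 layout, with heap settings in conf/neo4j-wrapper.conf and mapped-memory settings in conf/neo4j.properties. The heap value is the one reported in Edit 2 of the question, and the mapped-memory values are the question's originals, not tuned recommendations.

```properties
# conf/neo4j-wrapper.conf (assumed location; values are in MB)
# Heap raised from 1024 to 8192, as reported in Edit 2 of the question.
wrapper.java.initmemory=8192
wrapper.java.maxmemory=8192

# conf/neo4j.properties (assumed location)
# Mapped-memory values unchanged from the question; they total 1425M,
# which exceeded the original 1024 MB heap but fits in the larger one.
neostore.nodestore.db.mapped_memory=225M
neostore.relationshipstore.db.mapped_memory=250M
neostore.propertystore.db.mapped_memory=290M
neostore.propertystore.db.strings.mapped_memory=330M
neostore.propertystore.db.arrays.mapped_memory=330M
```

As noted in Edit 2, a heap this large requires a 64-bit Java runtime. Whatever values are chosen, on Windows the mapped-memory total should stay comfortably below the heap size so that room remains for regular Java objects.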