Slow results retrieval from Cypher query in Java - Neo4j 2.0
Question
I'm experiencing surprisingly slow retrieval of results with ResourceIterator<Node> when I get results from Cypher query execution in Java. The next() command takes on average 156 ms, with a standard deviation of 385! Is this behavior expected, or am I doing something wrong? Can anybody suggest a more efficient way of achieving the same thing?
I have the following graph layout, where Point nodes have LinksTo relations to other points:
Node:Point
Properties:
- idPoint (new style schema unique constraint on this property)
- x (new style schema index on this property)
- y (new style schema index on this property)
Relation:LinksTo
Properties:
- idLink
- length
(...relations don't even play a role in my question...)
Graph statistics:
- # of nodes: 890,000
- # of relations: 910,000
(Using Neo4j 2.0.0 stable with Oracle Java 7 on Ubuntu.)
(Basically, this code searches for nodes (points) in a 60x60 square around a given point.)
GraphDatabaseService graphDB = new GraphDatabaseFactory ( ).newEmbeddedDatabase ("points_db");
ExecutionEngine engine = new ExecutionEngine (graphDB);
for (Coordinate c : coords) // coords holds 500 different coordinates
{
    int size = 30;
    int xMin = c.x - size;
    int xMax = c.x + size;
    int yMin = c.y - size;
    int yMax = c.y + size;
    // The query text is rebuilt for every coordinate, so it is parsed again each time.
    String query = "MATCH (n:POINT) " +
                   " WHERE n.x > " + xMin +
                   " AND n.x < " + xMax +
                   " AND n.y > " + yMin +
                   " AND n.y < " + yMax +
                   " RETURN n AS neighbour"; // leading space keeps RETURN apart from yMax
    ExecutionResult result = engine.execute (query); // command1
    ResourceIterator<Node> ri = result.columnAs ("neighbour"); // command2
    while (ri.hasNext ( ))
    {
        Node n = ri.next ( ); // command3
        // ... some code ...
    }
}
Measurements
command1 average execution time: 7.5 ms
command2 average execution time: <1 ms
command3 average execution time: 156 ms (with 358 standard deviation)
(Measurements taken over 500 iterations (different coordinates); on average 6 points are found in each iteration. Measurements are repeatable.)
(Same setup as above: Neo4j 2.0.0 stable with Oracle Java 7 on Ubuntu. This version runs the same 60x60 square search with a parameterized query.)
GraphDatabaseService graphDB = new GraphDatabaseFactory ( ).newEmbeddedDatabase ("points_db");
ExecutionEngine engine = new ExecutionEngine (graphDB);
Map<String, Object> params = new HashMap<> ( );
int size = 30;
String query = "MATCH (n:POINT) " +
               " WHERE n.x > {xMin}" +
               " AND n.x < {xMax}" +
               " AND n.y > {yMin}" +
               " AND n.y < {yMax}" +
               " RETURN n AS neighbour";
for (Coordinate c : coords) // coords holds 500 different coordinates
{
    params.put ("xMin", (int) c.x - size);
    params.put ("xMax", (int) c.x + size);
    params.put ("yMin", (int) c.y - size);
    params.put ("yMax", (int) c.y + size);
    ExecutionResult result = engine.execute (query, params); // command1
    ResourceIterator<Node> ri = result.columnAs ("neighbour"); // command2
    while (ri.hasNext ( ))
    {
        Node n = ri.next ( ); // command3
        // ... some code ...
    }
}
Measurements
command1 average execution time: 1.7 ms
command2 average execution time: <1 ms
command3 average execution time: 112 ms (with 270 standard deviation)
(Measurements taken over 500 iterations (different coordinates); on average 6 points are found in each iteration. Measurements are repeatable.)
Recommended answer
What you're doing is not a graph query but a range scan over the whole database.
So it has to pull in all the nodes and do your comparisons for each of them.
You usually solve this by putting your nodes into a tree (an r-tree) that encodes the geometry into a two-dimensional tree structure. You can then reach whatever shape you need at a cost that grows with the number of tree levels (roughly logarithmic in the number of nodes) instead of scanning every node.
Check out the presentations about Neo4j spatial about this topic:
http://neo4j.org/develop/spatial
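To make the idea of a spatial index concrete, here is a minimal plain-Java sketch. It is not an r-tree and not Neo4j Spatial: it swaps in a much simpler uniform grid, and the class name `GridIndex` and its methods are hypothetical. It only illustrates why bucketing points by location lets a box query touch a handful of cells instead of all 890,000 nodes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical uniform-grid index over integer points; a simple stand-in for an r-tree. */
class GridIndex {
    private final int cellSize;
    // cell key -> points stored in that cell, each point kept as {id, x, y}
    private final Map<Long, List<int[]>> cells = new HashMap<Long, List<int[]>>();

    GridIndex(int cellSize) {
        this.cellSize = cellSize;
    }

    // Floor division, so negative coordinates land in the correct cell.
    private int cell(int v) {
        int q = v / cellSize;
        if (v % cellSize < 0) {
            q--;
        }
        return q;
    }

    // Pack the two cell coordinates into one long so the pair can be a map key.
    private static long key(int cx, int cy) {
        return ((long) cx << 32) | (cy & 0xFFFFFFFFL);
    }

    void add(int id, int x, int y) {
        long k = key(cell(x), cell(y));
        List<int[]> bucket = cells.get(k);
        if (bucket == null) {
            bucket = new ArrayList<int[]>();
            cells.put(k, bucket);
        }
        bucket.add(new int[] {id, x, y});
    }

    /** Ids of points with xMin < x < xMax and yMin < y < yMax (exclusive bounds, as in the query). */
    List<Integer> query(int xMin, int xMax, int yMin, int yMax) {
        List<Integer> hits = new ArrayList<Integer>();
        // Only the cells overlapping the box are visited, not the whole data set.
        for (int cx = cell(xMin); cx <= cell(xMax); cx++) {
            for (int cy = cell(yMin); cy <= cell(yMax); cy++) {
                List<int[]> bucket = cells.get(key(cx, cy));
                if (bucket == null) {
                    continue;
                }
                for (int[] p : bucket) {
                    if (p[1] > xMin && p[1] < xMax && p[2] > yMin && p[2] < yMax) {
                        hits.add(p[0]);
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        GridIndex idx = new GridIndex(30);
        idx.add(1, 100, 100);
        idx.add(2, 500, 500);
        // 60x60 square around (100, 100), as in the question
        System.out.println(idx.query(70, 130, 70, 130));
    }
}
```

With a cell size of 30 and a 60x60 search box, each query inspects at most a 3x3 block of cells. An r-tree gives the same kind of pruning but adapts to unevenly distributed points, which a fixed grid does not.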
You also force Neo4j to re-parse and re-build the query for each of your 500 coordinates. I agree with Luanne on parametrization, so your query should look like this. You should also build the query string once, before the for-loop:
String query = "MATCH (n:POINT) " +
               " WHERE n.x > {xMin}" +
               " AND n.x < {xMax}" +
               " AND n.y > {yMin}" +
               " AND n.y < {yMax}" +
               " RETURN n AS neighbour";
// map(...) is assumed to be a static import of org.neo4j.helpers.collection.MapUtil.map
ExecutionResult result = engine.execute (query,
    map("xMin", xMin, "xMax", xMax, "yMin", yMin, "yMax", yMax)); // query + params
....
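As a plain-Java illustration of the same parametrization, the sketch below keeps the query text constant and builds only the parameter map per coordinate. The class `NeighbourQuery` and the helper `boxParams` are hypothetical names, not part of the Neo4j API; the resulting map is meant to be passed to `engine.execute (query, params)` as in the snippet above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: one constant query string plus a per-coordinate parameter map. */
class NeighbourQuery {
    // Built once; contains only placeholders, so the text is identical for every coordinate.
    static final String QUERY =
        "MATCH (n:POINT) " +
        "WHERE n.x > {xMin} AND n.x < {xMax} " +
        "AND n.y > {yMin} AND n.y < {yMax} " +
        "RETURN n AS neighbour";

    /** Parameters for a square of side 2 * size centred on (x, y). */
    static Map<String, Object> boxParams(int x, int y, int size) {
        Map<String, Object> params = new HashMap<String, Object>();
        params.put("xMin", x - size);
        params.put("xMax", x + size);
        params.put("yMin", y - size);
        params.put("yMax", y + size);
        return params;
    }

    public static void main(String[] args) {
        System.out.println(QUERY);
        System.out.println(boxParams(100, 200, 30));
    }
}
```

Inside the loop this would be used as `engine.execute (NeighbourQuery.QUERY, NeighbourQuery.boxParams (c.x, c.y, 30))`. Note that this only avoids re-parsing the query; it does not remove the range scan described at the start of the answer.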