Slow results retrieval from Cypher query in Java - Neo4j 2.0

Problem description

I'm experiencing surprisingly slow retrieval of results with ResourceIterator<Node> when I get results from Cypher query execution in Java. The next() call takes 156 ms on average, with a standard deviation of 385! Is this behavior expected, or am I doing something wrong? Can anybody suggest a more efficient way of achieving the same thing?

I have the following graph layout, where Point nodes have LinksTo relations to other points:

Node:Point
Properties:
- idPoint (new style schema unique constraint on this property)
- x (new style schema index on this property)
- y (new style schema index on this property)

Relation:LinksTo
Properties:
- idLink
- length
(...relations don't even play a role in my question...)

Graph statistics:
- # of nodes: 890,000
- # of relations: 910,000

(Using Neo4j 2.0.0 stable with Oracle Java 7 on Ubuntu.)
(Basically, this code searches for nodes (points) in a 60x60 square around a given point.)

GraphDatabaseService graphDB = new GraphDatabaseFactory ( ).newEmbeddedDatabase ("points_db");

ExecutionEngine engine = new ExecutionEngine (graphDB);

for (Coordinate c : coords) // coords holds 500 different coordinates
{
    int size = 30;
    int xMin = c.x - size;
    int xMax = c.x + size;
    int yMin = c.y - size;
    int yMax = c.y + size;

    String query = "MATCH (n:POINT) " +
                     "  WHERE n.x > " + xMin +
                     "    AND n.x < " + xMax +
                     "    AND n.y > " + yMin +
                     "    AND n.y < " + yMax +
                     "  RETURN n AS neighbour"; // leading spaces keep RETURN from fusing with the yMax value

    ExecutionResult result = engine.execute (query); // command1

    ResourceIterator<Node> ri = result.columnAs ("neighbour"); // command2

    while (ri.hasNext ( ))
    {
        Node n = ri.next ( ); // command3
        // ... some code ...
    }
}

Measurements

command1 average execution time: 7.5 ms
command2 average execution time: <1 ms
command3 average execution time: 156 ms (with 358 standard deviation)

(Measurements were taken over 500 iterations (different coordinates); on average, 6 points are found in each iteration. The measurements are repeatable.)
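For reference, per-call averages and standard deviations like the ones above can be gathered with a small accumulator. The following is only a sketch: `TimingStats` is a hypothetical helper (not part of Neo4j or of the code in this question), and it assumes each call is timed with `System.nanoTime()`. It uses Welford's online algorithm so the samples do not need to be stored.

```java
/** Hypothetical helper: running mean and sample standard deviation of call times. */
final class TimingStats {
    private long count;   // number of samples recorded so far
    private double mean;  // running mean in milliseconds
    private double m2;    // running sum of squared deviations from the mean

    /** Records one elapsed time given in nanoseconds (e.g. a System.nanoTime() difference). */
    void record(long elapsedNanos) {
        double millis = elapsedNanos / 1_000_000.0;
        count++;
        double delta = millis - mean;
        mean += delta / count;
        m2 += delta * (millis - mean); // Welford update
    }

    double meanMillis() {
        return mean;
    }

    /** Sample standard deviation in milliseconds; 0.0 with fewer than two samples. */
    double stdDevMillis() {
        return count < 2 ? 0.0 : Math.sqrt(m2 / (count - 1));
    }

    public static void main(String[] args) {
        TimingStats stats = new TimingStats();
        long sink = 0;
        for (int i = 0; i < 100; i++) {
            long start = System.nanoTime();
            // Stand-in for the measured call, e.g. ri.next() in the loop above.
            for (int j = 0; j < 10_000; j++) {
                sink += j;
            }
            stats.record(System.nanoTime() - start);
        }
        System.out.println("mean ms: " + stats.meanMillis()
                + ", std dev ms: " + stats.stdDevMillis() + " (sink=" + sink + ")");
    }
}
```

In the loop from the question, the two `System.nanoTime()` reads would wrap the `ri.next()` call, with one `TimingStats` instance per command being measured.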

(Parameterized version of the same code, using Neo4j 2.0.0 stable with Oracle Java 7 on Ubuntu.)

GraphDatabaseService graphDB = new GraphDatabaseFactory ( ).newEmbeddedDatabase ("points_db");

ExecutionEngine engine = new ExecutionEngine (graphDB);
Map<String, Object> params = new HashMap<> ( );

int size = 30;
String query = "MATCH (n:POINT) " +
               "  WHERE n.x > {xMin}" +
               "    AND n.x < {xMax}" +
               "    AND n.y > {yMin}" +
               "    AND n.y < {yMax}" +
               "  RETURN n AS neighbour";

for (Coordinate c : coords) // coords holds 500 different coordinates
{
    params.put ("xMin", (int) c.x - size);
    params.put ("xMax", (int) c.x + size);
    params.put ("yMin", (int) c.y - size);
    params.put ("yMax", (int) c.y + size);

    ExecutionResult result = engine.execute (query, params); // command1

    ResourceIterator<Node> ri = result.columnAs ("neighbour"); // command2

    while (ri.hasNext ( ))
    {
        Node n = ri.next ( ); // command3
        // ... some code ...
    }
}

Measurements

command1 average execution time: 1.7 ms
command2 average execution time: <1 ms
command3 average execution time: 112 ms (with 270 standard deviation)
(Measurements were taken over 500 iterations (different coordinates); on average, 6 points are found in each iteration. The measurements are repeatable.)

Recommended answer

What you're doing is not a graph query but a range scan over the whole database.

So it has to pull in all the nodes and do your comparisons for each of them.

You usually solve this by putting your nodes into a tree (an R-tree) that encodes the geometry in a two-dimensional tree structure; you can then access whatever shape you need in only log(levels) complexity.

Check out the Neo4j Spatial presentations on this topic:

http://neo4j.org/develop/spatial
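To illustrate why a spatial index avoids the full scan, here is a minimal in-memory sketch. It is not Neo4j Spatial and not an R-tree: it swaps in a simpler uniform grid, where each point is bucketed by cell and a range query visits only the cells that overlap the search box. The class name `PointGrid`, its methods, and the cell size are all hypothetical choices for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical uniform-grid index; a simpler stand-in for an R-tree. */
final class PointGrid {
    /** One indexed point: an id (e.g. idPoint) plus integer coordinates. */
    static final class Entry {
        final long id;
        final int x;
        final int y;
        Entry(long id, int x, int y) { this.id = id; this.x = x; this.y = y; }
    }

    private final int cellSize;
    private final Map<Long, List<Entry>> cells = new HashMap<>();

    PointGrid(int cellSize) {
        if (cellSize <= 0) throw new IllegalArgumentException("cellSize must be positive");
        this.cellSize = cellSize;
    }

    // Floor division, so negative coordinates land in the correct cell.
    private int cell(int v) {
        int q = v / cellSize;
        return (v < 0 && v % cellSize != 0) ? q - 1 : q;
    }

    // Packs the two cell coordinates into a single map key.
    private static long key(int cx, int cy) {
        return ((long) cx << 32) | (cy & 0xFFFFFFFFL);
    }

    void add(long id, int x, int y) {
        long k = key(cell(x), cell(y));
        List<Entry> bucket = cells.get(k);
        if (bucket == null) {
            bucket = new ArrayList<>();
            cells.put(k, bucket);
        }
        bucket.add(new Entry(id, x, y));
    }

    /** Points with xMin < x < xMax and yMin < y < yMax (strict bounds, as in the Cypher query). */
    List<Entry> query(int xMin, int xMax, int yMin, int yMax) {
        List<Entry> hits = new ArrayList<>();
        for (int cx = cell(xMin); cx <= cell(xMax); cx++) {
            for (int cy = cell(yMin); cy <= cell(yMax); cy++) {
                List<Entry> bucket = cells.get(key(cx, cy));
                if (bucket == null) continue;
                for (Entry e : bucket) {
                    if (e.x > xMin && e.x < xMax && e.y > yMin && e.y < yMax) hits.add(e);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        PointGrid grid = new PointGrid(30);
        grid.add(1, 100, 100);
        grid.add(2, 120, 95);
        grid.add(3, 500, 500);
        // Search the 60x60 square around (100, 100), as in the question.
        System.out.println(grid.query(70, 130, 70, 130).size() + " neighbours found");
    }
}
```

With a cell size close to the search radius, each query touches only a handful of buckets instead of all 890,000 points. An R-tree gives a similar effect while also adapting to unevenly distributed data.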

You also force Neo4j to re-parse and re-build the query for each of your coordinates (500 times). I agree with Luanne on parameterization, so your query should look like this. You should also build the query string once, before the for loop:

String query = "MATCH (n:POINT) " +
                 "  WHERE n.x > {xMin}" +
                 "    AND n.x < {xMax}" +
                 "    AND n.y > {yMin}" +
                 "    AND n.y < {yMax}" +
                 "  RETURN n AS neighbour";

// map(...) is the static helper org.neo4j.helpers.collection.MapUtil.map
ExecutionResult result = engine.execute (query,
          map("xMin",xMin,"xMax",xMax,"yMin",yMin,"yMax",yMax)); // query + params

....
