Neo4j slower than MySQL in performing recursive query

Question

I would like to compare Neo4j (version 3.1) and MySQL on recursive queries. To do that, I created two tables in a MySQL database: Customer and CustomerFriend.

The second table consists of the CustomerID and FriendID columns, both of which reference the CustomerID column in the Customer table. In Neo4j I created the corresponding entities: Customer nodes and FRIEND_OF relationships, (c:Customer)-[f:FRIEND_OF]->(cc:Customer). Both databases are filled with the same data: 100000 Customers, each with 100 relationships. I executed the queries below:

MySQL (60 s)

SELECT distinct cf4.FriendID FROM customerfriend cf1
join customerfriend cf2 on cf1.FriendID = cf2.CustomerID
join customerfriend cf3 on cf2.FriendID = cf3.CustomerID
join customerfriend cf4 on cf3.FriendID = cf4.CustomerID
where cf1.CustomerID =99;

Neo4j (240 s)

match (c:Customer{CustomerID:99})-[:FRIEND_OF*4]->(cc:Customer)
return distinct cc.CustomerID;

The queries are run from a simple Java app, which just connects to the database (using the available connectors), runs the queries, and measures the execution times.
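For reference, a measurement loop of this kind could be sketched as follows. This is Python rather than the Java actually used, and `time_query` and `run_query` are hypothetical names introduced for illustration; `run_query` stands for whatever callable sends the query and consumes the full result. Repeating the run and reporting the first run separately helps distinguish cold-cache from warm-cache timings.

```python
import statistics
import time


def time_query(run_query, repetitions=5):
    """Call run_query several times and summarise the wall-clock durations.

    run_query is a hypothetical zero-argument callable that executes the
    query and reads every row, so that result streaming is included.
    """
    durations = []
    for _ in range(repetitions):
        started = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - started)
    return {
        "first": durations[0],                    # often a cold-cache run
        "median": statistics.median(durations),   # typical warm-cache run
        "best": min(durations),
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a real database call.
    print(time_query(lambda: sum(range(100_000))))
```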

The measured times clearly indicate that Neo4j is slower than MySQL on the above queries (MySQL 60 s, Neo4j 240 s). I also tested the queries with 50 relationships per Customer and got the same outcome: MySQL (7 s) was faster than Neo4j (17 s).

I have read some articles about recursive queries in Neo4j which indicate that Neo4j should handle this type of query better than MySQL. That is why I have started wondering whether I'm doing something wrong or whether these execution times are to be expected.

I'm also wondering whether Neo4j offers any options for tuning system performance. For MySQL, I set innodb_buffer_pool_size to 3g, which improved query performance (shorter execution time).

--------------------------------EDIT---------------------------

Taking the suggestions below into account, I rewrote my Neo4j query as follows:

match (c:Customer{CustomerID:99})-[:FRIEND_OF]->(c1)-[:FRIEND_OF]->(c2)
with distinct c2
match (c2)-[:FRIEND_OF]->(c3)
with distinct c3
match (c3)-[:FRIEND_OF]->(cc:Customer)
with distinct cc
return cc.CustomerID;

This achieved a better query time: 40 s
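A rough way to see why deduplicating after each hop helps: with 100 outgoing relationships per customer, a 4-hop pattern can match up to 100^4 = 100,000,000 paths, while no hop can ever reach more than the 100,000 distinct customers in the data set. The sketch below is plain Python, not Neo4j code. The tiny random graph, its sizes, and all function names are assumptions made for illustration, and it ignores Cypher's rule that a relationship may appear only once within a matched path. It counts relationship traversals for both strategies.

```python
import random


def build_graph(num_nodes, out_degree, seed=42):
    """Random directed graph: every node gets out_degree distinct friends."""
    rng = random.Random(seed)
    return {n: rng.sample(range(num_nodes), out_degree) for n in range(num_nodes)}


def expand_all_paths(graph, start, depth):
    """Follow every path and deduplicate only at the very end."""
    traversals = 0
    current = [start]  # duplicates are kept between hops
    for _ in range(depth):
        reached = []
        for node in current:
            for friend in graph[node]:
                traversals += 1
                reached.append(friend)
        current = reached
    return set(current), traversals


def expand_distinct_per_hop(graph, start, depth):
    """Deduplicate the frontier after every hop before expanding further."""
    traversals = 0
    frontier = {start}
    for _ in range(depth):
        reached = set()
        for node in frontier:
            for friend in graph[node]:
                traversals += 1
                reached.add(friend)
        frontier = reached
    return frontier, traversals


if __name__ == "__main__":
    graph = build_graph(num_nodes=200, out_degree=10)
    all_paths_result, all_paths_cost = expand_all_paths(graph, 0, 4)
    per_hop_result, per_hop_cost = expand_distinct_per_hop(graph, 0, 4)
    print("same result:", all_paths_result == per_hop_result)
    print("traversals, all paths:", all_paths_cost)
    print("traversals, distinct per hop:", per_hop_cost)
```

Both functions return the same set of nodes; only the amount of work differs, because the per-hop variant never expands the same node twice within a hop.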

For MySQL, I figured out a way to optimise the previous query along the same lines as the Neo4j query optimisation:

select distinct FriendID as depth4
from customerfriend
where CustomerID in
(select distinct FriendID as depth3
from customerfriend
where CustomerID in
(select distinct FriendID as depth2
from customerfriend
where CustomerID in
(select distinct FriendID as depth
from customerfriend
where CustomerID =99
)));

The execution time of this query is 24 s.

Neo4j is still slower than MySQL...

Recommended answer

You can make a small modification to make Neo4j about 50% faster, or, for even more speed, use the bitset dance shown at the bottom of this blog post: https://maxdemarzi.com/2013/12/31/the-power-of-open-source-software/
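The code from the linked post is not reproduced here. As a generic illustration of the bitset idea, the Python sketch below marks reached node IDs in one bit array per hop instead of materialising paths. The `BitSet` class, the `friends_at_depth` function, and the adjacency-list layout are all assumptions made for this example, not the blog post's or Neo4j's API.

```python
class BitSet:
    """Minimal fixed-size bitset backed by a bytearray (illustrative only)."""

    def __init__(self, size):
        self.size = size
        self.data = bytearray((size + 7) // 8)

    def set(self, index):
        self.data[index >> 3] |= 1 << (index & 7)

    def get(self, index):
        return bool(self.data[index >> 3] & (1 << (index & 7)))

    def members(self):
        return [i for i in range(self.size) if self.get(i)]


def friends_at_depth(adjacency, start, depth):
    """Return the sorted node IDs reachable in exactly `depth` hops.

    adjacency is assumed to be a list where adjacency[i] holds the IDs
    that node i points to. Two bitsets alternate: one for the current
    hop and one for the next, so each node is expanded at most once per hop.
    """
    current = BitSet(len(adjacency))
    current.set(start)
    for _ in range(depth):
        reached = BitSet(len(adjacency))
        for node_id in current.members():
            for friend_id in adjacency[node_id]:
                reached.set(friend_id)
        current = reached
    return current.members()


if __name__ == "__main__":
    # Tiny hand-made graph: 0 -> 1, 2; 1 -> 2, 3; 2 -> 3; 3 -> 0
    adjacency = [[1, 2], [2, 3], [3], [0]]
    print(friends_at_depth(adjacency, start=0, depth=2))
```

The memory cost is one bit per node per bitset, regardless of how many paths lead to each node.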

Update:

I went ahead and built a custom procedure for you.

It is available at https://github.com/maxdemarzi/distinct_network

It takes 2.9 seconds on my laptop with 10002045 relationships.

Second update:

I wrote a blog post on the subject: https://maxdemarzi.com/2017/02/06/neo4j-is-faster-than-mysql-in-performing-recursive-query/
