Batch Insertion with Neo4j


Problem description


I am importing 2.3 billion relationships from a table. The import is not very fast, getting a speed of about 5 million per hour, which means the migration will take 20 days to complete. I have heard about the Neo4j batch insert and the batch-import utility. The utility does interesting things by importing from a CSV file, but the latest code is somehow broken and does not run.
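For reference, the programmatic batch inserter mentioned above skips transactions and writes to the store files directly. What follows is only a minimal sketch of how that API is typically used; the store path, the userId property, the in-memory id cache and the row reader are placeholder assumptions, and the exact factory signature varies between Neo4j versions:

import java.util.HashMap;
import java.util.Map;

import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.RelationshipType;
import org.neo4j.unsafe.batchinsert.BatchInserter;
import org.neo4j.unsafe.batchinsert.BatchInserters;

public class KnowsBatchImport {
    private static final RelationshipType KNOW = DynamicRelationshipType.withName("KNOW");

    public static void main(String[] args) {
        // Writes directly to an offline store directory; no transactions involved.
        BatchInserter inserter = BatchInserters.inserter("data/knows.db");
        try {
            // In-memory mapping from the user id in the source table to the created node id,
            // so every user node is created exactly once (needs enough heap for all users).
            Map<Long, Long> userToNode = new HashMap<Long, Long>();
            for (long[] pair : readPairs()) {                   // hypothetical (userA, userB) reader
                long a = getOrCreate(inserter, userToNode, pair[0]);
                long b = getOrCreate(inserter, userToNode, pair[1]);
                inserter.createRelationship(a, b, KNOW, null);  // duplicates are allowed for now
            }
        } finally {
            inserter.shutdown();                                // flushes and closes the store
        }
    }

    // Returns the node id for this user, creating the node on first sight.
    private static long getOrCreate(BatchInserter inserter, Map<Long, Long> cache, long userId) {
        Long nodeId = cache.get(userId);
        if (nodeId == null) {
            Map<String, Object> props = new HashMap<String, Object>();
            props.put("userId", userId);
            nodeId = inserter.createNode(props);
            cache.put(userId, nodeId);
        }
        return nodeId;
    }

    private static Iterable<long[]> readPairs() {
        // Placeholder: read the (userA, userB) pairs from the source table or a CSV export.
        return java.util.Collections.<long[]>emptyList();
    }
}

Note that duplicates are still written in this sketch; any deduplication would have to happen afterwards, which is what the answer below does with Cypher.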

I have about 100M relationships in Neo4j, and I have to check all of them so that there are no duplicate relationships.

How can I speed things up in Neo4j?

My current code looks like this:

begin transaction
  for 50K relationships:
    create or get the user node for user A
    create or get the user node for user B
    check whether a KNOW relationship exists between A and B; if not, create the relationship
end transaction
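Spelled out against the embedded Java API of that era, the loop above is roughly the following. This is only a simplified sketch: the "users" legacy index, the "id" property key and the source of the (userA, userB) pairs are placeholder assumptions, not the actual import code.

import org.neo4j.graphdb.*;
import org.neo4j.graphdb.index.Index;

public class TransactionalImport {
    // KNOW is the relationship type from the question.
    private static final RelationshipType KNOW = DynamicRelationshipType.withName("KNOW");

    // Imports one batch of roughly 50K (userA, userB) pairs in a single transaction.
    public static void importBatch(GraphDatabaseService db, Iterable<long[]> pairs) {
        Transaction tx = db.beginTx();
        try {
            Index<Node> users = db.index().forNodes("users");  // legacy index, placeholder name
            for (long[] pair : pairs) {
                Node a = getOrCreate(db, users, pair[0]);
                Node b = getOrCreate(db, users, pair[1]);
                if (!alreadyKnows(a, b)) {                     // per-relationship duplicate check
                    a.createRelationshipTo(b, KNOW);
                }
            }
            tx.success();
        } finally {
            tx.finish();
        }
    }

    // Looks the user up in the index, creating and indexing the node if it is missing.
    private static Node getOrCreate(GraphDatabaseService db, Index<Node> users, long userId) {
        Node node = users.get("id", userId).getSingle();
        if (node == null) {
            node = db.createNode();
            node.setProperty("id", userId);
            users.add(node, "id", userId);
        }
        return node;
    }

    // Scans A's KNOW relationships for one that already points at B.
    private static boolean alreadyKnows(Node a, Node b) {
        for (Relationship r : a.getRelationships(KNOW)) {
            if (r.getOtherNode(a).equals(b)) {
                return true;
            }
        }
        return false;
    }
}

The lookup-and-check in alreadyKnows() is exactly the per-relationship uniqueness work that the answer below suggests deferring until after the import.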

I have also read the following:

Solution

In the case of relationships, and supposing you have enough storage, I would try not to make unique relationships during the import phase. Right now I'm actually also importing an SQL table with ~3 million records, but I always create the relationship and don't mind whether it is a duplicate or not.

Later, after the import, you can simply run a Cypher query which will create unique relationships, like this:

START n=node(*) MATCH n-[:KNOW]-m
CREATE UNIQUE n-[:KNOW2]-m;

and

START r=rel(*) WHERE type(r)='KNOW' DELETE r;

At least this is my approach for now, and running the later Cypher queries takes just a few minutes. The problem could come when you really have billions of nodes: then the Cypher query might run into a memory error (depending on how much cache you have set up for the Neo4j engine).
