What's going wrong with the phase of importing relationships?


Question

I finally got past the phase of importing nodes. Now I am trying to import relationships. There may be around 1B relationships.

#!/bin/bash
cd /home/luning/neo4j-enterprise-2.2.0-RC01-unix/neo4j-enterprise-2.2.0-RC01/bin
# Build a comma-separated list: the header file first, then every node data file.
users="/data/weibo/user-header.csv"
for i in /data/weibo/users/*
do
    users="$users,$i"
done
# Relationship input: the header file followed by the data file.
edges=/data/weibo/edge-header.csv,/data/weibo/ego/000000_0
./neo4j-import --stacktrace --into ../data/weibo_bak.db --nodes:User $users --relationships:Follow $edges --delimiter TAB --quote \' --bad-tolerance 50000 --id-type STRING

But the import always reports missing nodes. Strangely, importing the same files in two separate runs reported different missing nodes.

1. First run

   source: /data/weibo/ego/000000_0:1807199
   startNode: 1587438071
   endNode: 2414878813
   type: Follow
 refering to missing node 1587438071
java.lang.RuntimeException: Too many bad entries, saw 50001 where last one was InputRelationship:
   source: /data/weibo/ego/000000_0:1807199
   startNode: 1587438071
   endNode: 2414878813
   type: Follow
 refering to missing node 1587438071
    at org.neo4j.unsafe.impl.batchimport.staging.StageExecution.stillExecuting(StageExecution.java:63)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.anyStillExecuting(ExecutionSupervisor.java:79)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.finishAwareSleep(ExecutionSupervisor.java:102)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.supervise(ExecutionSupervisor.java:64)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisors.superviseDynamicExecution(ExecutionSupervisors.java:65)
    at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.executeStages(ParallelBatchImporter.java:226)
    at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.doImport(ParallelBatchImporter.java:152)
    at org.neo4j.tooling.ImportTool.main(ImportTool.java:263)
Caused by: org.neo4j.unsafe.impl.batchimport.input.InputException: Too many bad entries, saw 50001 where last one was InputRelationship:
   source: /data/weibo/ego/000000_0:1807199
   startNode: 1587438071
   endNode: 2414878813
   type: Follow
 refering to missing node 1587438071
    at org.neo4j.unsafe.impl.batchimport.input.BadRelationshipsCollector.collect(BadRelationshipsCollector.java:47)
    at org.neo4j.unsafe.impl.batchimport.input.BadRelationshipsCollector.collect(BadRelationshipsCollector.java:27)
    at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.incrementCount(CalculateDenseNodesStep.java:79)
    at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.process(CalculateDenseNodesStep.java:56)
    at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.process(CalculateDenseNodesStep.java:32)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep$2.call(ExecutorServiceStep.java:96)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep$2.call(ExecutorServiceStep.java:87)
    at org.neo4j.unsafe.impl.batchimport.executor.DynamicTaskExecutor$Processor.run(DynamicTaskExecutor.java:217)

2. Second run

source: /data/weibo/ego/000000_0:1844245
startNode: 3492922617
endNode: 1589699375
type: Follow
 refering to missing node 1589699375
java.lang.RuntimeException: Too many bad entries, saw 50001 where last one was InputRelationship:
   source: /data/weibo/ego/000000_0:1844245
   startNode: 3492922617
   endNode: 1589699375
   type: Follow
 refering to missing node 1589699375
    at org.neo4j.unsafe.impl.batchimport.staging.StageExecution.stillExecuting(StageExecution.java:63)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.anyStillExecuting(ExecutionSupervisor.java:79)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.finishAwareSleep(ExecutionSupervisor.java:102)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisor.supervise(ExecutionSupervisor.java:64)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutionSupervisors.superviseDynamicExecution(ExecutionSupervisors.java:65)
    at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.executeStages(ParallelBatchImporter.java:226)
    at org.neo4j.unsafe.impl.batchimport.ParallelBatchImporter.doImport(ParallelBatchImporter.java:152)
    at org.neo4j.tooling.ImportTool.main(ImportTool.java:263)
Caused by: org.neo4j.unsafe.impl.batchimport.input.InputException: Too many bad entries, saw 50001 where last one was InputRelationship:
   source: /data/weibo/ego/000000_0:1844245
   startNode: 3492922617
   endNode: 1589699375
   type: Follow
 refering to missing node 1589699375
    at org.neo4j.unsafe.impl.batchimport.input.BadRelationshipsCollector.collect(BadRelationshipsCollector.java:47)
    at org.neo4j.unsafe.impl.batchimport.input.BadRelationshipsCollector.collect(BadRelationshipsCollector.java:27)
    at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.incrementCount(CalculateDenseNodesStep.java:79)
    at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.process(CalculateDenseNodesStep.java:59)
    at org.neo4j.unsafe.impl.batchimport.CalculateDenseNodesStep.process(CalculateDenseNodesStep.java:32)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep$2.call(ExecutorServiceStep.java:96)
    at org.neo4j.unsafe.impl.batchimport.staging.ExecutorServiceStep$2.call(ExecutorServiceStep.java:87)
    at org.neo4j.unsafe.impl.batchimport.executor.DynamicTaskExecutor$Processor.run(DynamicTaskExecutor.java:217)

But I am sure that these two nodes, 1587438071 and 1589699375, are in my files, because I can find them:

[luning@pinnacle data]$ grep 1587438071 /data/weibo/users/*
/data/weibo/users/000024_0:1587438071   琬童沛胜    浙江 杭州           http://tp4.sinaimg.cn/1587438071/50/40024579617/0   f   147 60  272     false       LV2 31  一举成名|   正常  80                      2014-02-17 04:17:38


[luning@pinnacle data]$ grep 1589699375 /data/weibo/users/*
/data/weibo/users/000010_0:1589699375   在行动Isabella 吉林          http://tp4.sinaimg.cn/1589699375/50/5633181098/0    女   297 438 4729    1981-01-17  false       LV7            2014-08-13 21:43:34                      2014-01-28 10:18:52
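
Note that `grep` matches the ID anywhere on the line; in the output above each ID also appears inside the avatar URL. A stricter check compares the ID against the first column only. The following is a minimal sketch, assuming the node ID is the first TAB-separated field (as the output above suggests); the function name `has_node_id` is hypothetical.

```shell
#!/bin/bash
# Sketch: report lines whose FIRST tab-separated field equals the given ID.
# Assumption: the node ID lives in column 1 of every node file.
has_node_id() {
    # $1: node ID to look for; remaining arguments: files to scan
    local id="$1"; shift
    awk -F'\t' -v id="$id" '
        # Appending "" forces a string comparison, in line with --id-type STRING
        ($1 "") == (id "") { print FILENAME ":" FNR; found = 1 }
        END { exit !found }   # exit status 0 only if at least one line matched
    ' "$@"
}

# Example: has_node_id 1587438071 /data/weibo/users/*
```

A line found this way is present in the file, but that alone does not show the importer parsed it as a separate record.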

So, can anybody figure out how this could happen?

Recommended answer

It could be that your node input file(s) contain fields that don't close their quotes properly. That would cause some lines to be "eaten" by other lines, so those nodes would effectively not be imported (if the fields happen to align that way; otherwise an exception would be thrown). Or there could be something wrong with the parser when it encounters these Chinese characters.

Any chance you could share your input data with me (the main author of the parser and the import tool) for investigation?
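
One rough way to test the unclosed-quote hypothesis locally is to scan the node files for fields that open with the quote character but do not close with it. The sketch below is only a heuristic, assuming TAB as the delimiter and a single quote as the quote character (matching `--delimiter TAB --quote \'` in the question); the function name `find_unclosed_quotes` is hypothetical, and the import tool's parser may treat quotes differently, so flagged lines are candidates to inspect, not confirmed errors.

```shell
#!/bin/bash
# Heuristic sketch: list tab-separated fields that start with a single quote
# but do not end with one. Assumes --delimiter TAB and --quote \' as in the question.
find_unclosed_quotes() {
    # Arguments: files to scan. Output format: file:line: field N: <field text>
    awk -F'\t' -v q="'" '{
        for (i = 1; i <= NF; i++) {
            opens  = (substr($i, 1, 1) == q)
            closes = (length($i) > 1 && substr($i, length($i), 1) == q)
            if (opens && !closes) print FILENAME ":" FNR ": field " i ": " $i
        }
    }' "$@"
}

# Example: find_unclosed_quotes /data/weibo/users/*
```

A properly quoted field that itself contains a TAB would also be flagged, so each hit needs a manual look.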
