MongoDB: mongoimport loses connection when importing big files

Problem description

I'm having some trouble importing a JSON file into a local MongoDB instance. The JSON was generated using mongoexport and looks like this. No arrays, no hardcore nesting:

{"_created":{"$date":"2015-10-20T12:46:25.000Z"},"_etag":"7fab35685eea8d8097656092961d3a9cfe46ffbc","_id":{"$oid":"562637a14e0c9836e0821a5e"},"_updated":{"$date":"2015-10-20T12:46:25.000Z"},"body":"base64 encoded string","sender":"mail@mail.com","type":"answer"}
{"_created":{"$date":"2015-10-20T12:46:25.000Z"},"_etag":"7fab35685eea8d8097656092961d3a9cfe46ffbc","_id":{"$oid":"562637a14e0c9836e0821a5e"},"_updated":{"$date":"2015-10-20T12:46:25.000Z"},"body":"base64 encoded string","sender":"mail@mail.com","type":"answer"}

If I import a 9MB file with ~300 rows, there is no problem:

[stekhn latest]$ mongoimport -d mietscraping -c mails mails-small.json 
2015-11-02T10:03:11.353+0100    connected to: localhost
2015-11-02T10:03:11.372+0100    imported 240 documents

But if I try to import a 32MB file with ~1300 rows, the import fails:

[stekhn latest]$ mongoimport -d mietscraping -c mails mails.json 
2015-11-02T10:05:25.228+0100    connected to: localhost
2015-11-02T10:05:25.735+0100    error inserting documents: lost connection to server
2015-11-02T10:05:25.735+0100    Failed: lost connection to server
2015-11-02T10:05:25.735+0100    imported 0 documents

Here is the log:

2015-11-02T11:53:04.146+0100 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:45237 #21 (6 connections now open)
2015-11-02T11:53:04.532+0100 I -        [conn21] Assertion: 10334:BSONObj size: 23592351 (0x167FD9F) is invalid. Size must be between 0 and 16793600(16MB) First element: insert: "mails"
2015-11-02T11:53:04.536+0100 I NETWORK  [conn21] AssertionException handling request, closing client connection: 10334 BSONObj size: 23592351 (0x167FD9F) is invalid. Size must be between 0 and 16793600(16MB) First element: insert: "mails"

I've heard about the 16MB limit for BSON documents before, but since no row in my JSON file is bigger than 16MB, this shouldn't be a problem, right? When I do the exact same (32MB) import on my local computer, everything works fine.

Any ideas what could cause this weird behaviour?

Recommended answer

I guess the problem is about performance. Either way, there are a couple of ways you can solve it:

You can use the mongoimport option -j. If it doesn't work with 4, try increasing the value, i.e. 4, 8, 16, depending on the number of cores in your CPU:

mongoimport --help

-j, --numInsertionWorkers= number of insert operations to run concurrently (defaults to 1)

mongoimport -d mietscraping -c mails -j 4 < mails.json

Or you can split the file and import all the parts; see the sketch below.
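
A minimal sketch of the splitting approach, assuming a Unix shell, one document per line in mails.json (as mongoexport produces), and the same mietscraping/mails database and collection as above; the 500-line chunk size is an arbitrary choice:

# Split the export into 500-line chunks, then import each chunk in its
# own run, so no single insert batch comes anywhere near the 16MB limit.
split -l 500 mails.json mails-part-

for part in mails-part-*; do
  mongoimport -d mietscraping -c mails "$part"
done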

Hope this helps.

Looking a little more, this is a bug in some versions: https://jira.mongodb.org/browse/TOOLS-939. Here is another solution: you can change the batchSize. The default is 10000, and since mongoimport sends each batch to the server as a single insert command, a batch of large documents can exceed the 16MB limit shown in the log above. Reduce the value and test:

mongoimport -d mietscraping -c mails < mails.json --batchSize 1
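
If you would rather estimate a safe value than test blindly, here is a minimal sketch (my own addition, not from the original answer) that picks the largest batchSize whose total stays under the 16MB command limit, assuming one document per line and a POSIX awk. The JSON line length only approximates the BSON size, so the 16000000 figure leaves some headroom below the 16793600 bytes quoted in the log:

# Track the longest line, then print how many such documents fit in ~16MB
awk 'length > max { max = length } END { print int(16000000 / max) }' mails.json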
