Should I use the "allowDiskUse" option in a production environment?

Question

Should I use the allowDiskUse option when the documents returned by an aggregation exceed the 16MB limit?

Or should I change the database structure or the code logic to avoid the limit? What are the advantages and disadvantages of allowDiskUse? Thanks for your help.

Here is the official documentation I have read: Result Size Restrictions

Changed in version 2.6.

Starting in MongoDB 2.6, the aggregate command can return a cursor or store the results in a collection. When returning a cursor or storing the results in a collection, each document in the result set is subject to the BSON Document Size limit, currently 16 megabytes; if any single document exceeds the BSON Document Size limit, the command will produce an error. The limit only applies to the returned documents; during the pipeline processing, the documents may exceed this size.

Memory Restrictions

Changed in version 2.6.

Pipeline stages have a limit of 100 megabytes of RAM. If a stage exceeds this limit, MongoDB will produce an error. To allow for the handling of large datasets, use the allowDiskUse option to enable aggregation pipeline stages to write data to temporary files. https://docs.mongodb.com/manual/core/aggregation-pipeline-limits/

Recommended answer

allowDiskUse is unrelated to the 16MB result size limit. That setting controls whether pipeline stages such as $sort or $group can use temporary disk space when they need more than 100MB of memory. In theory, for an arbitrary pipeline this could be a very large amount of disk space. Personally it has never been a problem for me, but that will depend on your data.
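As a rough illustration of the setting described above, here is a minimal Python sketch. It assumes the `collection` argument behaves like a PyMongo `Collection`, whose `aggregate` method accepts `allowDiskUse` as a keyword argument. The helper names and the grouping field are hypothetical and only there for illustration.

```python
from typing import Any, Dict, Iterable, List


def build_sort_group_pipeline(group_field: str) -> List[Dict[str, Any]]:
    """Build a pipeline with $sort and $group, two stages that may
    need more than 100MB of memory on a large collection."""
    return [
        {"$sort": {group_field: 1}},
        {"$group": {"_id": "$" + group_field, "count": {"$sum": 1}}},
    ]


def aggregate_with_disk_use(
    collection: Any, pipeline: List[Dict[str, Any]]
) -> Iterable[Dict[str, Any]]:
    """Run the pipeline while allowing stages to spill to temporary files.

    Assumption: `collection` is a pymongo Collection (or anything exposing
    a compatible `aggregate` method).
    """
    return collection.aggregate(pipeline, allowDiskUse=True)
```

With a real connection this might be called as `aggregate_with_disk_use(db.orders, build_sort_group_pipeline("customer_id"))`, where `db.orders` and `customer_id` are placeholder names. The option only affects how stages handle memory pressure; it does not change the 16MB limit on each returned document.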

If your result is going to be more than 16MB, then you need to either use the $out pipeline stage to write the data to a collection, or use a pipeline API that returns a cursor over the results instead of returning all the data inline (for some drivers this is a separate method, for others it is a flag passed to the same method).
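The two options can be sketched in Python as follows. This is only an illustration under stated assumptions: `collection` is assumed to behave like a PyMongo `Collection`, whose `aggregate` method returns a cursor and accepts a `batchSize` keyword argument, and the helper names and target collection name are hypothetical.

```python
from typing import Any, Dict, Iterator, List


def with_out_stage(
    pipeline: List[Dict[str, Any]], target_collection: str
) -> List[Dict[str, Any]]:
    """Return a copy of the pipeline that writes its results to a collection.

    $out must be the last stage, so an existing $out is rejected.
    """
    if any("$out" in stage for stage in pipeline):
        raise ValueError("pipeline already contains a $out stage")
    return list(pipeline) + [{"$out": target_collection}]


def iterate_results(
    collection: Any, pipeline: List[Dict[str, Any]], batch_size: int = 1000
) -> Iterator[Dict[str, Any]]:
    """Stream results through a cursor instead of one inline reply.

    Assumption: `collection.aggregate` returns an iterable cursor, as
    pymongo's Collection.aggregate does.
    """
    cursor = collection.aggregate(pipeline, batchSize=batch_size)
    for document in cursor:
        yield document
```

Either way, each individual result document is still subject to the 16MB BSON limit quoted in the question; what goes away is the cap on the total size of the result set.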
