Map-Reduce performance in MongoDB 2.2, 2.4, and 2.6

Question

I found this discussion: "MongoDB: Terrible MapReduce Performance". Basically, it says to avoid Mongo's MR queries because MR is single-threaded and not meant for real-time use at all. Two years have passed, and I wonder what has changed since then. Now we have MongoDB 2.2, and I have heard that MR is now multi-threaded. Please share your thoughts on using MR for real-time requests, such as fetching data for a web application's frequent HTTP requests. Can it make effective use of indexes?

Recommended answer

Here is the current state of Map/Reduce functionality in MongoDB:

1) Most of the performance limitations of Map/Reduce remain in MongoDB version 2.2. The Map/Reduce engine still requires every record to be converted from BSON to JSON, the actual calculations are performed by the embedded JavaScript engine (which is slow), and there is still a single global JavaScript lock, which allows only one JavaScript thread to run at a time.

There have been some incremental improvements to Map/Reduce for sharded clusters. Most notably, the final Reduce operation is now distributed across multiple shards, and the output is also sharded in parallel.

I would not recommend Map/Reduce for real-time aggregation in MongoDB version 2.2.

2) Starting with MongoDB 2.2, there is a new Aggregation Framework. It is a new implementation of aggregation operations, written in C++ and tightly integrated into the MongoDB framework.

Most Map/Reduce jobs can be rewritten to use the Aggregation Framework. The rewritten jobs usually run faster (a 20x speed improvement over Map/Reduce is common in version 2.2), they make full use of the existing query engine, and you can run multiple Aggregation commands in parallel.
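To make the point about rewriting Map/Reduce jobs concrete, here is a minimal sketch of one job written both ways: summing an amount per customer. The sample documents, the field names (customer, amount), the collection name (orders), and the local emit() and runMapReduceLocally() helpers are all assumptions for illustration. The helpers only imitate how MongoDB calls the map and reduce functions so the sketch can run without a database; they are not MongoDB's engine and say nothing about performance.

```javascript
// Hypothetical sample documents; the field names are assumptions for illustration only.
const orders = [
  { customer: "alice", amount: 10 },
  { customer: "bob", amount: 5 },
  { customer: "alice", amount: 7 },
];

// --- Map/Reduce style ---
// On the server, MongoDB binds `this` to the current document and supplies emit().
// Here emit() is a local stand-in so the functions can run without a database.
const emitted = new Map();
function emit(key, value) {
  if (!emitted.has(key)) emitted.set(key, []);
  emitted.get(key).push(value);
}

function mapFn() {
  emit(this.customer, this.amount);
}

function reduceFn(key, values) {
  return values.reduce((sum, v) => sum + v, 0);
}

// Minimal local harness (not MongoDB's engine): map every document, then reduce per key.
function runMapReduceLocally(docs) {
  emitted.clear();
  for (const doc of docs) mapFn.call(doc);
  const totals = {};
  for (const [key, values] of emitted) {
    // As in MongoDB, reduce is skipped for keys that have only one emitted value.
    totals[key] = values.length === 1 ? values[0] : reduceFn(key, values);
  }
  return totals;
}

// --- Aggregation Framework style ---
// The same grouping expressed as a pipeline; in the mongo shell it would be passed to
// db.orders.aggregate(pipeline), where "orders" is an assumed collection name.
const pipeline = [
  { $group: { _id: "$customer", total: { $sum: "$amount" } } },
];

const localTotals = runMapReduceLocally(orders);
console.log(localTotals);
console.log(JSON.stringify(pipeline));
```

The pipeline version contains no JavaScript functions at all, only a declarative description of the grouping, which is why the server can evaluate it in native code.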

If you have real-time aggregation requirements, the first place to start is the Aggregation Framework. For more information about the Aggregation Framework, see the MongoDB documentation on aggregation.

3) There have been significant improvements to Map/Reduce in MongoDB version 2.4. The SpiderMonkey JavaScript engine has been replaced by the V8 JavaScript engine, and there is no longer a global JavaScript lock, which means that multiple Map/Reduce threads can run concurrently.

The Map/Reduce engine is still considerably slower than the Aggregation Framework, for two main reasons:

  • The JavaScript engine is interpreted, while the Aggregation Framework runs compiled C++ code.

  • The JavaScript engine still requires every document being examined to be converted from BSON to JSON; if you are saving the output in a collection, the result set must then be converted from JSON back to BSON.

There are no significant changes in Map/Reduce between 2.4 and 2.6.

I still do not recommend using Map/Reduce for real-time aggregation in MongoDB version 2.4 or 2.6.

4) If you really need Map/Reduce, you can also look at the Hadoop Adaptor.
