Why does MongoDB *client* use more memory than the server in this case?

Question

I'm evaluating MongoDB. I have a small 20GB subset of documents. Each is essentially a request log for a social game along with some captured state of the game the user was playing at that moment.

I thought I'd try finding game cheaters. So I wrote a function that runs server side. It calls find() on an indexed collection and sorts according to the existing index. Using a cursor it goes through all documents in indexed order. The index is {user_id,time}. So I'm going through each user's history, checking if certain values (money/health/etc) increase faster than is possible in the game. The script returns the first violation found. It does not collect violations.
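
For illustration, the scan described above might look roughly like the sketch below. It runs over a plain in-memory array rather than a MongoDB cursor. The field names (`money`, `time`), the `MAX_MONEY_PER_SECOND` cap, and the sample data are assumptions made up for this example, not taken from the actual script.

```javascript
// Hypothetical cap: the most money a player can legitimately gain per second.
const MAX_MONEY_PER_SECOND = 10;

// `logs` must already be sorted by (user_id, time), mirroring the index order.
// Returns the first violation found, or null if there is none.
function findFirstViolation(logs) {
  let prev = null;
  for (const doc of logs) {
    // Only compare consecutive entries that belong to the same user.
    if (prev !== null && prev.user_id === doc.user_id) {
      const seconds = (doc.time - prev.time) / 1000;
      const gained = doc.money - prev.money;
      if (gained > seconds * MAX_MONEY_PER_SECOND) {
        return { user_id: doc.user_id, logId: doc._id, time: doc.time };
      }
    }
    prev = doc;
  }
  return null;
}

// Made-up sample data: user 9 gains 5000 money in 5 seconds.
const sampleLogs = [
  { _id: 1, user_id: 7, time: new Date("2012-01-01T00:00:00Z"), money: 100 },
  { _id: 2, user_id: 7, time: new Date("2012-01-01T00:00:10Z"), money: 150 },
  { _id: 3, user_id: 9, time: new Date("2012-01-01T00:00:00Z"), money: 0 },
  { _id: 4, user_id: 9, time: new Date("2012-01-01T00:00:05Z"), money: 5000 },
];

const firstViolation = findFirstViolation(sampleLogs);
console.log(firstViolation);
```

Because the function stops at the first hit, it keeps no per-user state beyond the previous document.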

The ONLY thing that this script does on the client is define the function and call mymongodb.eval(myscript) on a mongod instance on another box.

The box that mongod is running on does fine. The one that the script is launched from starts losing memory and swap. Hours later: 8GB of RAM and 6GB of swap are being used on the client machine that did nothing more than launch a script on another box and wait for a return value.

Is the mongo client really that flaky? Have I done something wrong or made an incorrect assumption about mongo/mongod?

Recommended answer

From the docs:

Use map/reduce instead of db.eval() for long running jobs. db.eval blocks other operations!

eval is a function that blocks the entire server if you don't use a special flag. Again, from the docs:

If you don't use the "nolock" flag, db.eval() blocks the entire mongod process while running [...]

You are kind of abusing MongoDB here. Your current routine is strange, because it returns the first violation found, but it will have to re-check everything when run the next time (unless your user ids are ordered and you store the last evaluated user id).

Map/Reduce generally is the better option for a long-running task, but aggregating your data does not seem trivial. However, a map/reduce based solution would also solve the re-evaluation problem.

I'd probably return something like this from map/reduce:

user id -> suspicious actions, e.g.
------
2525454 -> [{logId: 235345435, t: ISODate("...")}]
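
A rough sketch of how map, reduce and finalize functions could produce output of that shape is shown below. It is only an illustration under assumptions: the field names, the `MAX_GAIN_PER_SECOND` cap, the local `emit` stand-in and the `runLocally` driver are all hypothetical and exist so the example can run without a server. In MongoDB the three functions would instead be passed to `mapReduce`, which supplies its own `emit`.

```javascript
// Hypothetical limit on legitimate money gain per second.
const MAX_GAIN_PER_SECOND = 10;

// Local stand-in for MongoDB's emit(); it only collects emitted values per key.
const emitted = new Map();
function emit(key, value) {
  if (!emitted.has(key)) emitted.set(key, []);
  emitted.get(key).push(value);
}

// map: called with `this` bound to one log document.
function mapLog() {
  emit(this.user_id, {
    events: [{ logId: this._id, t: this.time, money: this.money }],
  });
}

// reduce: only concatenates event lists, so re-reducing partial results is safe.
function reduceLogs(key, values) {
  let events = [];
  for (const v of values) events = events.concat(v.events);
  return { events: events };
}

// finalize: order one user's events by time and keep the suspicious ones.
function finalizeLogs(key, reduced) {
  const events = reduced.events.slice().sort((a, b) => a.t - b.t);
  const suspicious = [];
  for (let i = 1; i < events.length; i++) {
    const seconds = (events[i].t - events[i - 1].t) / 1000;
    const gained = events[i].money - events[i - 1].money;
    if (gained > seconds * MAX_GAIN_PER_SECOND) {
      suspicious.push({ logId: events[i].logId, t: events[i].t });
    }
  }
  return suspicious;
}

// Tiny local driver for this example only; NOT how MongoDB executes map/reduce.
function runLocally(docs) {
  emitted.clear();
  for (const doc of docs) mapLog.call(doc);
  const out = {};
  for (const [key, values] of emitted) {
    const suspicious = finalizeLogs(key, reduceLogs(key, values));
    if (suspicious.length > 0) out[key] = suspicious;
  }
  return out;
}

// Made-up sample data: user 9 gains 5000 money in 5 seconds.
const report = runLocally([
  { _id: 1, user_id: 7, time: new Date("2012-01-01T00:00:00Z"), money: 100 },
  { _id: 2, user_id: 7, time: new Date("2012-01-01T00:00:10Z"), money: 150 },
  { _id: 3, user_id: 9, time: new Date("2012-01-01T00:00:00Z"), money: 0 },
  { _id: 4, user_id: 9, time: new Date("2012-01-01T00:00:05Z"), money: 5000 },
]);
console.log(JSON.stringify(report));
```

Keeping the reduce step to plain concatenation and doing the comparison in finalize is a design choice for this sketch: reduce may be invoked several times on partial results, so the order-dependent check is deferred until all of a user's events are together.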
