Random Sampling from Mongo


Question

I have a Mongo collection of documents. Every document has one field whose value is either 0 or 1. I need to randomly sample 1000 records from the database and count how many of the sampled documents have that field set to 1. I need to repeat this sampling 1000 times. How do I do it?

Recommended answer

For MongoDB 3.0 and earlier, I use an old trick from the SQL days (which I think Wikipedia uses for its random page feature). I store a random number between 0 and 1 in every object I need to randomize; call that field "r". You then add an index on "r".

db.coll.ensureIndex({r: 1});
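
The index on "r" only helps once every document actually carries an "r" value. A minimal sketch of backfilling existing documents follows, assuming a collection object that behaves like a mongo shell collection with find() and update(). The helper name assignRandomKeys is hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: give every document a random key r in [0, 1).
// `coll` is assumed to behave like a mongo shell collection (e.g. db.coll).
function assignRandomKeys(coll) {
  var updated = 0;
  coll.find().forEach(function (doc) {
    // Store the random key on the document so the index on r can order it.
    coll.update({_id: doc._id}, {$set: {r: Math.random()}});
    updated += 1;
  });
  return updated; // number of documents that received a key
}
```

In the shell this would presumably be called as assignRandomKeys(db.coll). Documents inserted later would also need an "r" value set at insert time.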

Now, to get x random objects, you use:

// Pick a random starting point in [0, 1), then read the next x documents in r order.
var startVal = Math.random();
db.coll.find({r: {$gt: startVal}}).sort({r: 1}).limit(x);

This gives you random objects in a single find query. Depending on your needs this may be overkill, but if you will be sampling often over time, it is a very efficient approach that avoids putting much load on your backend.
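
To connect this back to the question (repeat the sampling many times and count the documents whose field is 1), here is a rough in-memory sketch of the same idea. It stands in for the find query rather than calling MongoDB. The field name flag, both function names, and the wrap-around to the smallest "r" values when too few documents lie above the starting point are assumptions added for illustration, not part of the original answer.

```javascript
// In-memory stand-in for: find({r: {$gt: startVal}}).sort({r: 1}).limit(x)
// Assumption: if fewer than x documents have r > startVal, wrap around
// and continue from the smallest r values.
function sampleByRandomKey(docs, x, startVal) {
  var sorted = docs.slice().sort(function (a, b) { return a.r - b.r; });
  var after = sorted.filter(function (d) { return d.r > startVal; });
  var before = sorted.filter(function (d) { return d.r <= startVal; });
  return after.concat(before).slice(0, x);
}

// Draw `rounds` samples of size x and count, for each sample,
// the documents whose (hypothetical) `flag` field equals 1.
function countOnesPerSample(docs, x, rounds) {
  var counts = [];
  for (var i = 0; i < rounds; i++) {
    var sample = sampleByRandomKey(docs, x, Math.random());
    var ones = sample.filter(function (d) { return d.flag === 1; }).length;
    counts.push(ones);
  }
  return counts;
}
```

For the question as asked, the call would be countOnesPerSample(docs, 1000, 1000). Because each sample is a run of neighbours in "r" order, repeated samples can overlap; re-assigning "r" between rounds is one way to reduce that, at the cost of extra writes.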
