MongoDB Query for records with non-existent field & indexing


Problem description

We have a mongo database with around 1M documents, and we want to poll this database using a processed field to find documents which we haven't seen before. To do this we are setting a new field called _processed.

To find documents which still need to be processed, we query for documents which do not have this field:

db.stocktwits.find({ "_processed" : { "$exists" : false } })
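The poll-and-mark cycle described above could look roughly like the sketch below. It is only an illustration: the helper name processUnseen, the handle callback, and the choice to store a millisecond timestamp in _processed are assumptions, not details given in the question. It uses the legacy shell update call to match the MongoDB versions discussed in this post (around 2.0).

```javascript
// Hypothetical helper: `collection` is a mongo shell collection such as db.stocktwits,
// `handle` is whatever per-document work the application does.
function processUnseen(collection, handle) {
  var count = 0;
  // Same filter as the query above: documents that have no _processed field yet.
  collection.find({ "_processed": { "$exists": false } }).forEach(function (doc) {
    handle(doc);
    // Mark the document so the next poll skips it.
    // Assumption: a millisecond timestamp is stored as a plain JS number;
    // the question only says the field holds a long.
    collection.update(
      { "_id": doc._id },
      { "$set": { "_processed": new Date().getTime() } }
    );
    count += 1;
  });
  // Number of documents handled in this poll.
  return count;
}
```

In the shell this would be called as processUnseen(db.stocktwits, function (doc) { printjson(doc); }).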

However, this query takes around 30 seconds to complete each time, which is rather slow. There is an index on the _processed field (declared descending, -1):

db.stocktwits.ensureIndex({ "_processed" : -1 },{ "name" : "idx_processed" });

Adding this index does not change query performance. There are a few other indexes on the collection (namely the ID index and a unique index over a couple of fields in each document).

The _processed field is a long; perhaps this should be changed to a bool to make things quicker?

We have tried using a $where query (i.e. $where : this._processed==null) to do the same thing as $exists : false, and the performance is about the same (a few seconds slower, which makes sense)...

Any ideas on what would be causing the slow performance (or is it normal)? Does anyone have any suggestions on how to improve the query speed?

Cheers!

Recommended answer

Upgrading to MongoDB 2.0 should fix this for you:

From MongoDB.org:

Before v2.0, $exists is not able to use an index. Indexes on other fields are still used.
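After upgrading, one way to confirm that the query now uses the index is to look at the plan reported by explain(). A minimal sketch, assuming the collection from the question; the helper name explainUnprocessed is made up for illustration, and the layout of the explain output differs between server versions, so no output is shown.

```javascript
// Hypothetical helper: `db` is the database handle the mongo shell provides.
function explainUnprocessed(db) {
  // Same filter as in the question; explain() returns the query plan
  // instead of the matching documents.
  return db.stocktwits
    .find({ "_processed": { "$exists": false } })
    .explain();
}
```

In the shell, explainUnprocessed(db) prints the plan. On servers of that era, a cursor value of BasicCursor would indicate a full collection scan, while a BtreeCursor entry naming idx_processed would indicate that the index was used.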
