MongoDB, performance of query by regular expression on indexed fields
Question
I want to find an account by name (in a MongoDB collection of 50K accounts).
The usual way: we search with a string.
db.accounts.find({ name: 'Jon Skeet' }) // indexes help improve performance!
What about a regular expression? Is it an expensive operation?
db.accounts.find( { name: /Jon Skeet/ }) // worry! how indexes work with regex?
According to WiredPrairie:
MongoDB uses the prefix of a regular expression to look up indexes (e.g. /^prefix.*/):
db.accounts.find( { name: /^Jon Skeet/ }) // indexes will help!
Recommended answer
Actually, according to the documentation:
If an index exists for the field, then MongoDB matches the regular expression against the values in the index, which can be faster than a collection scan. Further optimization can occur if the regular expression is a "prefix expression", which means that all potential matches start with the same string. This allows MongoDB to construct a "range" from that prefix and only match against those values from the index that fall within that range.
http://docs.mongodb.org/manual/reference/operator/query/regex/#index-use
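To make the "range" idea concrete, here is a minimal sketch in plain JavaScript of how a literal prefix could be turned into a half-open key range. It is only an illustration, not MongoDB's actual implementation: the helper names prefixRange and inRange are hypothetical, and the sketch assumes simple code-unit string ordering and a non-empty prefix.

```javascript
// Hypothetical helper (not a MongoDB API): derive a half-open range
// [lower, upper) that contains every string starting with `prefix`.
// Assumes code-unit ordering and a non-empty prefix whose last
// code unit is below 0xFFFF.
function prefixRange(prefix) {
  const last = prefix.charCodeAt(prefix.length - 1);
  // Bump the last character by one to get the first string past the range.
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { lower: prefix, upper };
}

// True when `key` falls inside the range derived from the prefix.
function inRange(key, range) {
  return key >= range.lower && key < range.upper;
}

const range = prefixRange("Jon Skeet");
console.log(range.lower, "..", range.upper);
console.log(inRange("Jon Skeet Jr.", range)); // starts with the prefix
console.log(inRange("Jon Snow", range));      // does not start with the prefix
```

Under these assumptions, only keys between the two bounds need to be compared against the regular expression, which is why an anchored prefix is cheaper than an unanchored pattern.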
In other words:
For the regex /Jon Skeet/, MongoDB scans every key in the index and then fetches the matching documents, which can be faster than a collection scan.
For the regex /^Jon Skeet/, MongoDB scans only the range of index keys that start with the prefix, which is faster still.
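The difference between the two cases can be illustrated with a toy model in plain JavaScript. This is a sketch only: the index is modelled as a sorted array of strings rather than MongoDB's real B-tree, and the names indexKeys, scanAllKeys and scanPrefix, as well as the sample account names, are made up for the example.

```javascript
// Toy model of an index: a sorted array of key strings (an assumption,
// not MongoDB's real index structure).
const indexKeys = [
  "Ann Lee", "Jon Skeet", "Jon Skeet Jr.", "Jon Snow", "Mr Jon Skeet", "Zed Wu",
].sort();

// Unanchored regex such as /Jon Skeet/: every key has to be tested.
function scanAllKeys(keys, regex) {
  let examined = 0;
  const matches = [];
  for (const key of keys) {
    examined++;
    if (regex.test(key)) matches.push(key);
  }
  return { examined, matches };
}

// Anchored prefix such as /^Jon Skeet/: binary-search to the first key
// >= prefix, then walk forward until a key no longer starts with it.
// `examined` counts only the keys visited during that walk.
function scanPrefix(keys, prefix) {
  let lo = 0;
  let hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (keys[mid] < prefix) lo = mid + 1;
    else hi = mid;
  }
  let examined = 0;
  const matches = [];
  for (let i = lo; i < keys.length; i++) {
    examined++;
    if (!keys[i].startsWith(prefix)) break;
    matches.push(keys[i]);
  }
  return { examined, matches };
}

console.log(scanAllKeys(indexKeys, /Jon Skeet/));
console.log(scanPrefix(indexKeys, "Jon Skeet"));
```

Note that the two patterns are not equivalent: in this toy data the unanchored regex also matches "Mr Jon Skeet", which the anchored one excludes. To check what a real deployment does, running the query with explain("executionStats") in the mongo shell reports the number of index keys and documents examined.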