如何同时使用$ gt和$ lte优化MongoDB查询? [英] How to optimize MongoDB query with both $gt and $lte?
问题描述
我有以下查询,有点像反向范围查询:
I have the following query that is kind of like a reverse range lookup:
db.ip_ranges.find({ $and: [{ start_ip_num: { $lte: 1204135028 } }, { end_ip_num: { $gt: 1204135028 } }] })
仅使用$ lte标识符运行时,查询立即返回.但是当我在同一查询中同时运行$ gt和$ lte时,它非常慢(以秒为单位).
When run with only the $lte identifier, the query returns right away. But when I run with both the $gt and $lte in the same query, it is extremely slow (in seconds).
start_ip_num和end_ip_num字段均已建立索引.
Both the start_ip_num and end_ip_num fields are indexed.
如何优化此查询?
编辑
在查询上使用explain()函数时,我得到以下信息:
I get the following when I use the explain() function on the query:
{
"cursor" : "BtreeCursor start_ip_num_1",
"nscanned" : 452336,
"nscannedObjects" : 452336,
"n" : 1,
"millis" : 2218,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {
"start_ip_num" : [
[
-1.7976931348623157e+308,
1204135028
]
]
}
}
编辑2
添加复合索引后,explain()函数将返回以下内容:
Once I added the compound index, the explain() function returns the following:
{
"cursor" : "BtreeCursor start_ip_num_1_end_ip_num_1",
"nscanned" : 431776,
"nscannedObjects" : 1,
"n" : 1,
"millis" : 3433,
"nYields" : 0,
"nChunkSkips" : 0,
"isMultiKey" : false,
"indexOnly" : false,
"indexBounds" : {
"start_ip_num" : [
[
-1.7976931348623157e+308,
1204135028
]
],
"end_ip_num" : [
[
1204135028,
1.7976931348623157e+308
]
]
}
}
但是,性能仍然很差(以秒为单位).
However, the perf is still poor (in seconds).
推荐答案
因此,在Mongo中不建议使用双范围查询.我假设您有一个包含两个{start_ip_num: 1, end_ip_num: 1}
的索引.
So, double range queries are ill-advised in Mongo. I assume you have a single index containing both {start_ip_num: 1, end_ip_num: 1}
.
如果这样还不能使您足够接近(通常,如果第一个字段返回的数据足够,它仍然很慢,因为它必须进行大量的B树扫描),您可以采取以下一种技巧使用2D框查询来解决此问题(一次仅适用于两个范围).
If that doesn't get you close enough (often it is still slow if you have enough data returned by the first field, since it has to do a lot of B-tree scanning), there is one trick you can do to combat this using 2D box queries (only works for two ranges at a time).
基本上,您将二维地理索引放在包含数组中两个点的字段上,例如[start_ip,end_ip],并为其提供足够高的最小/最大值,以使其不会达到默认情况下为-180/180.
Basically, you put a 2D geo index on a field containing the two points in an array, like [start_ip, end_ip], and give it a high enough min/max value so that it won't hit the limits which are by default just -180/180.
最后,使用边界查询,范围在框的一个角上从min到$ lte值,在框的另一个角上用gt和max值.请参见 http://www.mongodb.org/display/DOCS/Geospatial+ Indexing#GeospatialIndexing-BoundsQueries 用于语法.
Finally, use a bounds query with the range going from min to the $lte value on one corner of the box, and the gt and the max value on the other corner of the box. See http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-BoundsQueries for syntax.
它看起来像这样:
db.ip_ranges.find({ip_range:{$within:{$box:[[0, 1204135028], [1204135028, max]]}}});
其中max是您可以拥有的最大IP.
where max is the biggest ip you can have.
已经有一段时间了,因为我已经看过这个了,所以框可能是错的,但是这个概念很合理,这使得双范围查询的性能比常规的两字段B树索引更好.持续不到一秒钟(虽然通常是几百毫秒),而使用常规索引则要花费几秒钟-我认为我当时有数亿文档,但是已经有一段时间了,因此请认真地使用这些记住的基准盐.我敢肯定,结果会因您的数据和范围大小而有很大差异.
It's been a while since I've looked a this, so the box might be wrong, but the concept is sound, and it made the double range queries perform a bit better than with a regular two field B-tree index. Consistently under a second (although usually a few hundred ms), compared to a few seconds with the regular index--I think I had hundreds of millions of docs at the time, but it's been a while so take these remembered benchmarks with a grain of salt. Results will vary greatly depending on your data and range sizes, I'm sure.
更新:您可能想尝试使用bits
设置,尝试使用低数字和高数字以查看是否有所不同.对我来说,它似乎并没有平均影响查询.参见 http://www.mongodb.org/display/DOCS/Geospatial+ Indexing#GeospatialIndexing-CreatingtheIndex 获取语法.
Update: You might want to experiment with the bits
setting, trying a low number and a high number to see if it makes a difference. For me, it didn't seem to affect the queries on average. See http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-CreatingtheIndex for syntax.
这篇关于如何同时使用$ gt和$ lte优化MongoDB查询?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!