Range query for MongoDB pagination

Question

I want to implement pagination on top of MongoDB. For my range query, I thought about using ObjectIds:

db.tweets.find({ _id: { $lt: maxID } }, { limit: 50 })

However, according to the docs, the structure of the ObjectId means that "ObjectId values do not represent a strict insertion order":

The relationship between the order of ObjectId values and generation time is not strict within a single second. If multiple systems, or multiple processes or threads on a single system generate values, within a single second; ObjectId values do not represent a strict insertion order. Clock skew between clients can also result in non-strict ordering even for values, because client drivers generate ObjectId values, not the mongod process.

I then thought about querying with a timestamp:

db.tweets.find({ created: { $lt: maxDate } }, { limit: 50 })

However, there is no guarantee the date will be unique; it's quite likely that two documents could be created within the same second. This means documents could be missed when paging.

Is there any sort of range query that would provide me with more stability?

Recommended answer

It is perfectly fine to use ObjectId(), though your syntax for pagination is wrong. You want:

 db.tweets.find().limit(50).sort({"_id":-1});

This says you want tweets sorted by _id value in descending order and you want the 50 most recent. The difficulty is that pagination is tricky when the result set is changing. So rather than using skip for the next page, make a note of the smallest _id in the result set (the 50th most recent _id value) and then get the next page with:

 db.tweets.find( {_id : { "$lt" : <50th _id> } } ).limit(50).sort({"_id":-1});

This will give you the next "most recent" tweets, without new incoming tweets messing up your pagination back through time.
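
As a rough illustration of why this stays stable, here is a small sketch that runs without a server. The `fetchPage` helper and the in-memory `tweets` array are hypothetical stand-ins for the collection, and plain integers stand in for ObjectId values; the comments show the query each step corresponds to.

```javascript
// In-memory stand-in for the collection, so the sketch runs without MongoDB.
// Against a real collection the equivalent call would be:
//   db.tweets.find(filter).sort({ _id: -1 }).limit(pageSize)
function fetchPage(docs, lastSeenId, pageSize) {
  const matching = lastSeenId === null
    ? docs.slice()                               // first page: no filter
    : docs.filter((d) => d._id < lastSeenId);    // { _id: { $lt: lastSeenId } }
  matching.sort((a, b) => b._id - a._id);        // sort({ _id: -1 })
  return matching.slice(0, pageSize);            // limit(pageSize)
}

// Integers are an assumption standing in for ObjectId values.
const tweets = [];
for (let i = 1; i <= 7; i++) {
  tweets.push({ _id: i, text: "tweet " + i });
}

// First page: the three most recent tweets.
const page1 = fetchPage(tweets, null, 3);

// Remember the smallest _id on the page; it is the cursor for the next page.
const cursor = page1[page1.length - 1]._id;

// A new tweet arrives between the two requests.
tweets.push({ _id: 8, text: "tweet 8" });

// Second page: continues below the cursor, unaffected by the new tweet.
const page2 = fetchPage(tweets, cursor, 3);

console.log(page1.map((d) => d._id), page2.map((d) => d._id));
```

With skip(3) instead of the cursor, the second request in this scenario would return the tweet with _id 5 a second time, because the new tweet shifts every older document down by one position.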

There is absolutely no need to worry about whether the _id value corresponds strictly to insertion order. It will be close enough 99.999% of the time, and no one actually cares at the sub-second level which tweet came first. You might even notice that Twitter frequently displays tweets out of order; it's just not that critical.

If it is critical, then you would have to use the same technique but with a "tweet date" field, where that date would have to be a timestamp rather than just a date.
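
The answer does not say how to handle two tweets that share the same timestamp, which is the concern raised in the question. One common approach, shown here only as a sketch and not as the answerer's method, is to add _id as a tie-breaker in both the filter and the sort. The helper name `nextPageFilter` is hypothetical, and the `created` field name is taken from the question.

```javascript
// Hypothetical helper: builds the filter for the page that follows `last`,
// where `last` is the final document of the previous page (null for page 1).
function nextPageFilter(last) {
  if (last === null) {
    return {};
  }
  return {
    $or: [
      // Strictly older than the last document seen.
      { created: { $lt: last.created } },
      // Same timestamp: fall back to _id to break the tie.
      { created: last.created, _id: { $lt: last._id } },
    ],
  };
}

// The sort must use the same two keys, in the same order, as the filter.
const sortSpec = { created: -1, _id: -1 };

// Example values are made up; a real _id would be an ObjectId.
const filter = nextPageFilter({ created: 1700000000000, _id: 42 });

// Against a real collection this would be used as:
//   db.tweets.find(filter).sort(sortSpec).limit(50)
console.log(JSON.stringify(filter), JSON.stringify(sortSpec));
```

Because the pair (created, _id) is unique even when created alone is not, no document can fall into the gap between two pages.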
