Implementing pagination in MongoDB

Question

I know that using skip to implement pagination is bad practice, because once your data gets large skip starts to consume a lot of memory. One way to overcome this trouble is to use the natural order of the _id field:

//Page 1
db.users.find().limit(pageSize);
//Find the id of the last document in this page
last_id = ...

//Page 2
users = db.users.find({ '_id': { '$gt': last_id } }).limit(10);

The problem is that I'm new to mongo and do not know the best way to get this last_id.

Solution

The concept you are talking about can be called "forward paging". The reason for the name is that, unlike the .skip() and .limit() modifiers, it cannot be used to "go back" to a previous page or to "skip" to a specific page, at least not without a great deal of effort to store "seen" or "discovered" pages. So if that type of "links to page" paging is what you want, you are best off sticking with the .skip() and .limit() approach, despite the performance drawbacks.
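
For comparison, here is a minimal sketch of what page-number paging with .skip() and .limit() amounts to. It runs against an in-memory array rather than a real collection, and pageBySkip is a hypothetical helper invented for illustration. The point is that any page can be reached directly, but the offset grows with the page number, and those skipped documents still have to be stepped over.

```javascript
// In-memory stand-in for a collection (assumption: ten documents with numeric _id).
const docs = Array.from({ length: 10 }, (_, i) => ({ _id: i + 1 }));

// Hypothetical helper that mirrors
//   db.collection.find().sort({ _id: 1 }).skip(offset).limit(pageSize)
function pageBySkip(collection, pageNumber, pageSize) {
  const sorted = [...collection].sort((a, b) => a._id - b._id);
  const offset = (pageNumber - 1) * pageSize; // documents stepped over before the page starts
  return sorted.slice(offset, offset + pageSize);
}

// Jump straight to page 3 with 3 items per page.
const page3 = pageBySkip(docs, 3, 3).map((d) => d._id);
console.log(page3); // [ 7, 8, 9 ]
```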

If only "moving forward" is a viable option for you, then here is the basic concept:

db.junk.find().limit(3)

{ "_id" : ObjectId("54c03f0c2f63310180151877"), "a" : 1, "b" : 1 }
{ "_id" : ObjectId("54c03f0c2f63310180151878"), "a" : 4, "b" : 4 }
{ "_id" : ObjectId("54c03f0c2f63310180151879"), "a" : 10, "b" : 10 }

Of course, that is your first page with a limit of 3 items. Now consider it with code iterating the cursor:

var lastSeen = null;
var cursor = db.junk.find().limit(3);

while (cursor.hasNext()) {
   var doc = cursor.next();
   printjson(doc);
   if (!cursor.hasNext())
     lastSeen = doc._id;
}

That iterates the cursor and does something with each document. When the last item in the cursor is reached, you store its _id in lastSeen:

ObjectId("54c03f0c2f63310180151879")

In subsequent iterations you just feed that _id value, which you keep (in the session or wherever), to the query:

var cursor = db.junk.find({ "_id": { "$gt": lastSeen } }).limit(3);

while (cursor.hasNext()) {
   var doc = cursor.next();
   printjson(doc);
   if (!cursor.hasNext())
     lastSeen = doc._id;
}

{ "_id" : ObjectId("54c03f0c2f6331018015187a"), "a" : 1, "b" : 1 }
{ "_id" : ObjectId("54c03f0c2f6331018015187b"), "a" : 6, "b" : 6 }
{ "_id" : ObjectId("54c03f0c2f6331018015187c"), "a" : 7, "b" : 7 }

And the process repeats over and over until no more results can be obtained.
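
The whole loop can be sketched end to end as follows. This is an in-memory simulation, not driver code: the junk array and the nextPage helper are hypothetical stand-ins for the collection and the query. The sketch also sorts explicitly by _id, on the assumption that you would add .sort({ "_id": 1 }) to the real query, since result order is not guaranteed without a sort.

```javascript
// In-memory stand-in for db.junk (assumption: numeric _id values, stored out of order).
const junk = [5, 2, 9, 1, 7, 3, 8].map((id) => ({ _id: id }));

// Hypothetical helper that mirrors
//   db.junk.find({ _id: { $gt: lastSeen } }).sort({ _id: 1 }).limit(pageSize)
// A null lastSeen means "first page", so no filter is applied.
function nextPage(collection, lastSeen, pageSize) {
  return collection
    .filter((doc) => lastSeen === null || doc._id > lastSeen)
    .sort((a, b) => a._id - b._id)
    .slice(0, pageSize);
}

let lastSeen = null;
const pages = [];
while (true) {
  const page = nextPage(junk, lastSeen, 3);
  if (page.length === 0) break; // no more results: paging is finished
  pages.push(page.map((d) => d._id));
  lastSeen = page[page.length - 1]._id; // remember the last _id on this page
}
console.log(pages); // [ [ 1, 2, 3 ], [ 5, 7, 8 ], [ 9 ] ]
```

Only lastSeen has to survive between requests, which is why it can live in a session or be handed back to the client.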

That's the basic process for a natural order such as _id. For something else it gets a bit more complex. Consider the following:

{ "_id": 4, "rank": 3 }
{ "_id": 8, "rank": 3 }
{ "_id": 1, "rank": 3 }    
{ "_id": 3, "rank": 2 }

To split that into two pages sorted by rank, what you essentially need to know is what you have "already seen", so that you can exclude those results. So, looking at a first page:

var lastSeen = null;
var seenIds = [];
var cursor = db.junk.find().sort({ "rank": -1 }).limit(2);

while (cursor.hasNext()) {
   var doc = cursor.next();
   printjson(doc);
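   // A new rank value means the ids collected for the previous rank are no longer needed
   if ( lastSeen != null && doc.rank != lastSeen )
       seenIds = [];
   seenIds.push(doc._id);
   // Track the rank of the most recent document so the check above compares against it
   lastSeen = doc.rank;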
}

{ "_id": 4, "rank": 3 }
{ "_id": 8, "rank": 3 }

On the next iteration you want documents with a "rank" score less than or equal to the lastSeen value, while also excluding those already seen. You do this with the $nin operator:

var cursor = db.junk.find(
    { "_id": { "$nin": seenIds }, "rank": { "$lte": lastSeen } }
).sort({ "rank": -1 }).limit(2);

while (cursor.hasNext()) {
   var doc = cursor.next();
   printjson(doc);
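   // A new rank value means the ids collected for the previous rank are no longer needed
   if ( lastSeen != null && doc.rank != lastSeen )
       seenIds = [];
   seenIds.push(doc._id);
   // Track the rank of the most recent document so the check above compares against it
   lastSeen = doc.rank;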
}

{ "_id": 1, "rank": 3 }    
{ "_id": 3, "rank": 2 }

How many "seenIds" you actually hold on to depends on how "granular" your results are, that is, how often the sorted value is likely to change. In this case you can check whether the current "rank" score differs from the lastSeen value and, if so, discard the present seenIds content so it does not grow too much.
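
The bookkeeping for lastSeen and seenIds can be traced with a small in-memory simulation of the four sample documents. The nextRankPage helper and the state object are hypothetical names used only for this sketch, and the array filter and sort stand in for the real query. The sketch updates lastSeen on every document, so the reset check always compares against the rank of the previous document.

```javascript
// In-memory stand-in for the sample documents shown earlier.
const junk = [
  { _id: 4, rank: 3 },
  { _id: 8, rank: 3 },
  { _id: 1, rank: 3 },
  { _id: 3, rank: 2 },
];

// Hypothetical helper that mirrors
//   db.junk.find({ _id: { $nin: seenIds }, rank: { $lte: lastSeen } })
//     .sort({ rank: -1 }).limit(pageSize)
// and then updates the paging state the way the cursor loop does.
function nextRankPage(collection, state, pageSize) {
  const page = collection
    .filter((d) => !state.seenIds.includes(d._id))
    .filter((d) => state.lastSeen === null || d.rank <= state.lastSeen)
    .sort((a, b) => b.rank - a.rank) // descending rank; ties keep their array order
    .slice(0, pageSize);

  for (const doc of page) {
    // Rank changed: ids from the previous rank are excluded by $lte from now on.
    if (state.lastSeen !== null && doc.rank !== state.lastSeen) state.seenIds = [];
    state.seenIds.push(doc._id);
    state.lastSeen = doc.rank;
  }
  return page;
}

const state = { lastSeen: null, seenIds: [] };
const first = nextRankPage(junk, state, 2).map((d) => d._id);
console.log(first); // [ 4, 8 ]
const second = nextRankPage(junk, state, 2).map((d) => d._id);
console.log(second); // [ 1, 3 ]
console.log(state); // { lastSeen: 2, seenIds: [ 3 ] }
```

After the second page the ids collected for rank 3 have been dropped, because the rank <= 2 condition already excludes those documents.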

Those are the basic concepts of "forward paging" for you to practice and learn.
