Implementing pagination in MongoDB

Question

I know that it is bad practice to use skip to implement pagination, because skip starts to consume a lot of memory as the data gets large. One way around this is to page by the natural order of the _id field:

//Page 1
db.users.find().limit(pageSize);
//Find the id of the last document in this page
last_id = ...

//Page 2
users = db.users.find({'_id': {'$gt': last_id}}).limit(pageSize);

The problem is that I'm new to MongoDB and don't know the best way to get this last_id.

Recommended answer

The concept you are talking about can be called "forward paging". The name fits because, unlike the .skip() and .limit() modifiers, it cannot be used to "go back" to a previous page or to "skip" to a specific page, at least not without a great deal of effort to store "seen" or "discovered" pages. So if that type of "links to page" paging is what you want, you are best off sticking with the .skip() and .limit() approach, despite the performance drawbacks.
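
For comparison, page-number paging with .skip() and .limit() might look roughly like the sketch below. It runs against a plain in-memory array rather than a live collection so that it is self-contained; pageBySkip and the sample data are hypothetical names introduced only for illustration, and the shell query in the comment is what the function is meant to mirror.

```javascript
// Hypothetical in-memory stand-in for a collection, already ordered by _id.
const docs = [1, 2, 3, 4, 5, 6, 7].map((n) => ({ _id: n }));

// Intended to mirror, in the mongo shell:
//   db.junk.find().sort({ _id: 1 }).skip((page - 1) * pageSize).limit(pageSize)
function pageBySkip(allDocs, page, pageSize) {
  const offset = (page - 1) * pageSize; // number of documents stepped over
  return allDocs.slice(offset, offset + pageSize);
}

// Any page can be requested directly, but the cost grows with the offset.
console.log(pageBySkip(docs, 3, 3)); // page 3 holds only the document with _id 7
```

The appeal is that the page number alone identifies the page, which is what makes "links to page" navigation easy.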

If only "moving forward" is a viable option for you, then here is the basic concept:

db.junk.find().limit(3)

{ "_id" : ObjectId("54c03f0c2f63310180151877"), "a" : 1, "b" : 1 }
{ "_id" : ObjectId("54c03f0c2f63310180151878"), "a" : 4, "b" : 4 }
{ "_id" : ObjectId("54c03f0c2f63310180151879"), "a" : 10, "b" : 10 }

Of course that's your first page with a limit of 3 items. Consider that now with code iterating the cursor:

var lastSeen = null;
var cursor = db.junk.find().limit(3);

while (cursor.hasNext()) {
   var doc = cursor.next();
   printjson(doc);
   if (!cursor.hasNext())
     lastSeen = doc._id;
}

That iterates the cursor and does something with each document. Once the last item in the cursor is reached, you store its _id in lastSeen:

ObjectId("54c03f0c2f63310180151879")

In subsequent iterations you feed that _id value, which you keep somewhere such as the session, into the query:

var cursor = db.junk.find({ "_id": { "$gt": lastSeen } }).limit(3);

while (cursor.hasNext()) {
   var doc = cursor.next();
   printjson(doc);
   if (!cursor.hasNext())
     lastSeen = doc._id;
}

{ "_id" : ObjectId("54c03f0c2f6331018015187a"), "a" : 1, "b" : 1 }
{ "_id" : ObjectId("54c03f0c2f6331018015187b"), "a" : 6, "b" : 6 }
{ "_id" : ObjectId("54c03f0c2f6331018015187c"), "a" : 7, "b" : 7 }

And the process repeats over and over until no more results can be obtained.
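
The whole loop can be wrapped in a small helper. The following is a minimal sketch under stated assumptions: it uses an in-memory array as a stand-in for db.junk so it can run on its own, nextPage and allPages are hypothetical names, and the explicit sort on _id is an addition here, since the order of an unsorted find() is not guaranteed to follow _id.

```javascript
// Hypothetical in-memory stand-in for db.junk.
const junk = [1, 2, 3, 4, 5, 6, 7].map((n) => ({ _id: n, a: n * 10 }));

// Intended to mirror, in the mongo shell:
//   db.junk.find(lastSeen === null ? {} : { _id: { $gt: lastSeen } })
//          .sort({ _id: 1 }).limit(pageSize)
function nextPage(allDocs, lastSeen, pageSize) {
  const page = allDocs
    .filter((doc) => lastSeen === null || doc._id > lastSeen)
    .sort((x, y) => x._id - y._id)
    .slice(0, pageSize);
  // Keep the previous marker when the page comes back empty.
  const newLastSeen = page.length > 0 ? page[page.length - 1]._id : lastSeen;
  return { page, lastSeen: newLastSeen };
}

// Walk forward until a page comes back empty.
function allPages(allDocs, pageSize) {
  const pages = [];
  let lastSeen = null;
  for (;;) {
    const result = nextPage(allDocs, lastSeen, pageSize);
    if (result.page.length === 0) break;
    pages.push(result.page.map((doc) => doc._id));
    lastSeen = result.lastSeen;
  }
  return pages;
}

console.log(allPages(junk, 3)); // [ [ 1, 2, 3 ], [ 4, 5, 6 ], [ 7 ] ]
```

Only the lastSeen value has to survive between requests, so it is the one thing to keep in the session.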

That's the basic process for a natural order such as _id. For something else it gets a bit more complex. Consider the following:

{ "_id": 4, "rank": 3 }
{ "_id": 8, "rank": 3 }
{ "_id": 1, "rank": 3 }    
{ "_id": 3, "rank": 2 }

To split that into two pages sorted by rank, what you essentially need to know is what you have "already seen", so that you can exclude those results. So looking at a first page:

var lastSeen = null;
var seenIds = [];
var cursor = db.junk.find().sort({ "rank": -1 }).limit(2);

while (cursor.hasNext()) {
   var doc = cursor.next();
   printjson(doc);
   // When the rank changes, ids from the previous rank can be discarded
   if (doc.rank !== lastSeen) {
       seenIds = [];
       lastSeen = doc.rank;
   }
   seenIds.push(doc._id);
}

{ "_id": 4, "rank": 3 }
{ "_id": 8, "rank": 3 }

On the next iteration you want documents whose "rank" is less than or equal to the lastSeen score, while also excluding the documents already seen. You do this with the $nin operator:

var cursor = db.junk.find(
    { "_id": { "$nin": seenIds }, "rank": { "$lte": lastSeen } }
).sort({ "rank": -1 }).limit(2);

while (cursor.hasNext()) {
   var doc = cursor.next();
   printjson(doc);
   // When the rank changes, ids from the previous rank can be discarded
   if (doc.rank !== lastSeen) {
       seenIds = [];
       lastSeen = doc.rank;
   }
   seenIds.push(doc._id);
}

{ "_id": 1, "rank": 3 }    
{ "_id": 3, "rank": 2 }

How many "seenIds" you actually hold on to depends on how "granular" your results are, that is, how often the sorted value is likely to change. In this case you can check whether the current "rank" score differs from the lastSeen value and, if so, discard the present seenIds content so that it does not grow too much.
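
The state that has to be carried from page to page is therefore the pair of lastSeen and seenIds. The sketch below packages that idea as one function, again against an in-memory stand-in for the four sample documents so that it runs on its own. nextRankPage and the shape of the state object are hypothetical, and lastSeen is updated as soon as the rank changes, so seenIds always holds exactly the ids at the current rank.

```javascript
// The four sample documents, as a hypothetical in-memory stand-in for db.junk.
const junk = [
  { _id: 4, rank: 3 },
  { _id: 8, rank: 3 },
  { _id: 1, rank: 3 },
  { _id: 3, rank: 2 },
];

// `state` is { lastSeen, seenIds }; pass { lastSeen: null, seenIds: [] } for page 1.
// Intended to mirror, in the mongo shell:
//   db.junk.find({ _id: { $nin: seenIds }, rank: { $lte: lastSeen } })
//          .sort({ rank: -1 }).limit(pageSize)
function nextRankPage(allDocs, state, pageSize) {
  let lastSeen = state.lastSeen;
  let seenIds = state.seenIds.slice(); // copy so the caller's state is untouched
  const page = allDocs
    .filter((doc) => !seenIds.includes(doc._id))
    .filter((doc) => lastSeen === null || doc.rank <= lastSeen)
    .sort((x, y) => y.rank - x.rank) // descending rank; ties keep input order
    .slice(0, pageSize);
  for (const doc of page) {
    if (doc.rank !== lastSeen) {
      seenIds = []; // ids from a higher rank can no longer match the query
      lastSeen = doc.rank;
    }
    seenIds.push(doc._id);
  }
  return { page, state: { lastSeen, seenIds } };
}

const first = nextRankPage(junk, { lastSeen: null, seenIds: [] }, 2);
const second = nextRankPage(junk, first.state, 2);
console.log(first.page.map((d) => d._id));  // [ 4, 8 ]
console.log(second.page.map((d) => d._id)); // [ 1, 3 ]
console.log(second.state);                  // { lastSeen: 2, seenIds: [ 3 ] }
```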

Those are the basic concepts of "forward paging" for you to practice and learn.
