Pagination with batch queries? Is it possible to batch gets from the datastore and get a cursor?

Question

I am currently requesting 20 entries from the datastore and returning them to the user together with a cursor. If the user asks for more entries, I use that cursor as the new start and request the next 20 entries.

The code looks something like this:

q := datastore.NewQuery("Item").
    Limit(limit)

// Resume from the client-supplied cursor if it decodes successfully.
if cursor, err := datastore.DecodeCursor(cursor); err == nil {
    q = q.Start(cursor)
}

var is []Item
t := q.Run(c)
for {
    var i Item
    _, err := t.Next(&i)
    if err == datastore.Done {
        break // all results for this page consumed
    }
    if err != nil {
        // A real error, not end-of-results; stop instead of looping forever.
        break
    }

    is = append(is, i)
}

In case it is important, here is the complete code: https://github.com/koffeinsource/kaffeeshare/blob/master/data/appengine.go#L23

It looks like an anti-pattern to use a loop with an append, but I don't see a way to get a cursor when using GetMulti/GetAll. Or am I missing something?

I do expect data to be added while users are querying the datastore, so an offset may produce duplicate results. Should I care about batching gets in this case?

Solution

Your approach is perfectly fine; in fact, it is the best way on AppEngine.

Querying subsequent entities by setting a start cursor will not give you duplicate results, even if a new record is inserted that would, for example, come first.

Why? Because the cursor encodes the key of the last returned entity, not the number of previously returned entities.

So if you set a cursor, the datastore will start listing and returning entities that come after the key encoded in the cursor. If a new entity is saved that comes after the cursor, then that entity will be returned when reached.
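
To make this concrete, here is a minimal sketch of returning the next-page cursor to the client after iterating. It is not the asker's exact code: fetchPage is a hypothetical helper, and it assumes the google.golang.org/appengine packages and the Item type from the question.

import (
    "context"

    "google.golang.org/appengine/datastore"
)

// fetchPage runs the query and returns the entities plus an encoded
// cursor that the client can send back to request the next page.
func fetchPage(c context.Context, q *datastore.Query) ([]Item, string, error) {
    var is []Item
    t := q.Run(c)
    for {
        var i Item
        _, err := t.Next(&i)
        if err == datastore.Done {
            break
        }
        if err != nil {
            return nil, "", err // a real error, not end-of-results
        }
        is = append(is, i)
    }

    // The cursor marks the key of the last returned entity, not an offset,
    // so entities inserted before it cannot reappear on the next page.
    cur, err := t.Cursor()
    if err != nil {
        return nil, "", err
    }
    return is, cur.String(), nil
}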

Also, using the for loop with append() is the best way. You might optimize it a little by creating a big enough slice beforehand:

var is = make([]Item, 0, limit)

But note that I made it with 0 length and limit capacity on purpose: there is no guarantee that there will be enough entities to fill the full slice.

Another optimization would be to allocate it with limit length:

var is = make([]Item, limit)

and when datastore.Done is reached, reslice it if it is not filled fully, for example:

for idx := 0; ; idx++ {
    var i Item
    _, err := t.Next(&i)
    if err == datastore.Done {
        if idx < len(is) {
            is = is[:idx] // Reslice as it is not filled fully
        }
        break
    }
    if err != nil {
        // A real iteration error: keep what we have and stop,
        // instead of looping forever.
        is = is[:idx]
        break
    }

    is[idx] = i
}

Batch operations

GetMulti, PutMulti and DeleteMulti are batch versions of the Get, Put and Delete functions. They take a []*Key instead of a *Key, and may return an appengine.MultiError when encountering partial failure.
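
For illustration, a batch get with partial-failure handling might look like the following sketch. getItems is a hypothetical helper; it assumes the google.golang.org/appengine packages and the Item type from the question.

import (
    "context"

    "google.golang.org/appengine"
    "google.golang.org/appengine/datastore"
    "google.golang.org/appengine/log"
)

// getItems fetches the entities for the given keys in one batch call.
func getItems(c context.Context, keys []*datastore.Key) ([]Item, error) {
    items := make([]Item, len(keys)) // one destination slot per key
    err := datastore.GetMulti(c, keys, items)
    if me, ok := err.(appengine.MultiError); ok {
        // On partial failure, me[i] holds the error (or nil) for keys[i].
        for i, e := range me {
            if e != nil {
                log.Errorf(c, "fetching key %v failed: %v", keys[i], e)
            }
        }
    }
    return items, err
}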

Batch operations are not a replacement for, or an alternative to, queries. GetMulti, for example, requires you to already have all the keys for which you want to fetch the complete entities. As such, there is no notion of a cursor for these batch operations.

Batch operations return all the requested information (or perform all the requested operations). There is no sequence of entities or operations that could be terminated and continued later on.

Queries and batch operations are for different things. You shouldn't worry about query and cursor performance. They perform quite well and, more importantly, they (the Datastore) scale well. A cursor will not slow down the execution of a query: a query with a cursor runs just as fast as one without, and previously returned entities do not affect query execution time. It doesn't matter whether you run a query without a cursor or with a cursor acquired after fetching a million entities (which is only possible over several iterations).
