Google App Engine: counting records from an NDB model


Question

How many records can we get from Google App Engine in a single query, so that we can display a count to the user? And can we increase the timeout limit from 3 seconds to 5 seconds?

Answer

In my experience, ndb cannot pull more than 1000 records at a time. Here is an example of what happens if I try to use .count() on a kind that contains ~500,000 records.

s~project-id> models.Transaction.query().count()
WARNING:root:suspended generator _count_async(query.py:1330) raised AssertionError()
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/utils.py", line 160, in positional_wrapper
    return wrapped(*args, **kwds)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/query.py", line 1287, in count
    return self.count_async(limit, **q_options).get_result()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/tasklets.py", line 383, in get_result
    self.check_success()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/tasklets.py", line 427, in _help_tasklet_along
    value = gen.throw(exc.__class__, exc, tb)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/query.py", line 1330, in _count_async
    batch = yield rpc
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/ndb/tasklets.py", line 513, in _on_rpc_completion
    result = rpc.get_result()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/api/apiproxy_stub_map.py", line 614, in get_result
    return self.__get_result_hook(self)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/datastore/datastore_query.py", line 2910, in __query_result_hook
    self._batch_shared.conn.check_rpc_success(rpc)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/datastore/datastore_rpc.py", line 1377, in check_rpc_success
    rpc.check_success()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/api/apiproxy_stub_map.py", line 580, in check_success
    self.__rpc.CheckSuccess()
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/api/apiproxy_rpc.py", line 157, in _WaitImpl
    self.request, self.response)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/remote_api/remote_api_stub.py", line 308, in MakeSyncCall
    handler(request, response)
  File "/usr/local/Caskroom/google-cloud-sdk/latest/google-cloud-sdk/platform/google_appengine/google/appengine/ext/remote_api/remote_api_stub.py", line 362, in _Dynamic_Next
    assert next_request.offset() == 0
AssertionError

To bypass this, you can page through the results with cursors, something like:

objs = []
cursor = None
more = True
while more:
    # fetch_page returns (results, cursor, more); pass the cursor back in
    # on the next iteration to resume where the previous page left off.
    page, cursor, more = models.Transaction.query().fetch_page(
        300, start_cursor=cursor)
    objs.extend(page)

But even that will eventually hit memory/timeout limits.
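If all you need is the count rather than the entities themselves, you can keep memory down by paging with keys_only=True and tallying page sizes instead of accumulating results. Below is a minimal sketch of that loop, written against a generic fetch_page callable so the logic stands alone; with ndb you would hand it the query's fetch_page (passing keys_only=True as well). The helper name count_by_pages is mine, not from the original answer.

```python
def count_by_pages(fetch_page, page_size=1000):
    """Count results by walking cursor-delimited pages.

    `fetch_page(page_size, start_cursor=...)` must return a
    (results, cursor, more) tuple, mirroring ndb's Query.fetch_page.
    Only one page is held in memory at a time.
    """
    total = 0
    cursor = None
    more = True
    while more:
        results, cursor, more = fetch_page(page_size, start_cursor=cursor)
        total += len(results)
    return total
```

Note that while this bounds memory, the total runtime still grows linearly with the number of entities, so it does not escape request deadlines on large kinds.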

Currently I use Google Dataflow to pre-compute these values and store the results in Datastore as the models DaySummaries & StatsPerUser.
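Another common way to keep a count available without ever scanning the kind is to maintain it at write time, using the classic App Engine sharded-counter pattern so concurrent increments do not contend on a single entity. The sketch below is only an in-memory toy of the idea (the real version stores each shard as its own Datastore entity and increments it inside a transaction); the class and names are illustrative, not part of the original answer.

```python
import random

class ShardedCounter(object):
    """In-memory illustration of the Datastore sharded-counter pattern:
    each write lands on a randomly chosen shard, and a read sums
    every shard to produce the total."""

    def __init__(self, num_shards=20):
        self.shards = [0] * num_shards

    def increment(self, delta=1):
        # Spreading writes across shards avoids one hot entity;
        # Datastore throttles sustained writes to a single entity group.
        self.shards[random.randrange(len(self.shards))] += delta

    def count(self):
        return sum(self.shards)
```

The trade-off is that reads become slightly more expensive (one fetch per shard, which the real pattern usually hides behind memcache), in exchange for a count that is O(shards) instead of O(entities).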

snakecharmerb is correct. I was able to use .count() in the production environment, but the more entities it has to count, the longer it takes. Here's a screenshot of my logs viewer where it took ~15 seconds to count ~330,000 records.

When I tried adding a filter to that query which returned a count of ~4500, it took about a second to run instead.
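Since the latency scales with the number of entities matched, one pragmatic option when the UI only needs a rough figure is to cap the count: ndb's Query.count() accepts a limit, so you can ask for at most N+1 and display "N+" once the cap is exceeded. The helper below sketches that display logic against a generic counting function; the name display_count is mine.

```python
def display_count(count_up_to, cap=1000):
    """Return the exact count as a string when it is <= cap,
    otherwise the string '<cap>+'.

    `count_up_to(limit)` should return min(actual_count, limit),
    which is how ndb's Query.count(limit=...) behaves.
    """
    n = count_up_to(cap + 1)  # one past the cap tells us we overflowed
    return str(n) if n <= cap else "%d+" % cap
```

With ndb this would be called as something like display_count(lambda limit: models.Transaction.query(...).count(limit=limit)); counting stops after scanning cap+1 index entries, so the request stays fast regardless of how large the kind is.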

OK, I had another App Engine project with a kind containing ~8,000,000 records. I tried to run .count() on it in my HTTP request handler, and the request timed out after running for 60 seconds.

