For Google App Engine (java), how do I set and use chunk size in FetchOptions?


Problem description

I'm running a query that currently returns 1400 results, and because of this I am getting the following warning in the log file:


com.google.appengine.api.datastore.QueryResultsSourceImpl logChunkSizeWarning: This query does not have a chunk size set in FetchOptions and has returned over 1000 results. If result sets of this size are common for this query, consider setting a chunk size to improve performance.

I can't find any examples anywhere of how to actually implement this. There is a question on here about Python, but as I'm using Java and don't understand Python, I am struggling to translate it.

Also, this query (below) is taking 17226 cpu_ms to execute, which feels like way too long. I can't even imagine what would happen if I had, say, 5000 contacts and needed to search through them on the client side (like you do with Google Mail contacts!)

The code I have is:

    int index=0;
    int numcontacts=0;
    String[][] DetailList;

    PersistenceManager pm = PMF.get().getPersistenceManager();


    try {
        Query query = pm.newQuery(Contact.class, "AdminID == AID");
        query.declareParameters("Long AID");
        query.setOrdering("Name asc");
        List<Contact> Contacts = (List<Contact>) query.execute(AdminID);
        numcontacts=Contacts.size();
        DetailList=new String[numcontacts][5];

        for (Contact contact : Contacts) 
        {
            DetailList[index][0]=contact.getID().toString();
            DetailList[index][1]=Encode.EncodeString(contact.getName());
            index++;
        }
    } finally {
        pm.close();
    }
    return (DetailList);

I found the following two entries on here:

  • google app engine chunkSize & prefetchSize - where can I read details on it?
  • GAE/J Low-level API: FetchOptions usage

but neither actually goes into any detail about how to implement or use these options. I'm guessing it's a server-side process, and that you are meant to set up some kind of loop to grab the chunks one at a time, but how do I actually do that?


  • Do I call the query inside a loop?
  • How do I know how many times to loop?
  • Do I just check for the first chunk that comes back with fewer than the chunk-size number of entries?
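In other words, the pattern I am imagining is a stop-when-short loop: keep fetching pages until one comes back with fewer entries than the page size. Here is a self-contained sketch of just that stopping logic, in plain Java with a stand-in fetch method (not real datastore code, since that is exactly the part I don't know):

```java
import java.util.ArrayList;
import java.util.List;

public class PagedFetch {

    // Stand-in for a datastore query: returns up to `limit` items
    // starting at `offset` from a backing list.
    static List<Integer> fetchPage(List<Integer> source, int offset, int limit) {
        int end = Math.min(offset + limit, source.size());
        if (offset >= end) {
            return new ArrayList<>();
        }
        return new ArrayList<>(source.subList(offset, end));
    }

    // Keep fetching until a page comes back shorter than the page size.
    static List<Integer> fetchAll(List<Integer> source, int pageSize) {
        List<Integer> all = new ArrayList<>();
        int offset = 0;
        while (true) {
            List<Integer> page = fetchPage(source, offset, pageSize);
            all.addAll(page);
            if (page.size() < pageSize) {
                break;
            }
            offset += pageSize;
        }
        return all;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 1400; i++) {
            data.add(i);
        }
        System.out.println(fetchAll(data, 500).size()); // prints 1400
    }
}
```

Note the edge case: when the total is an exact multiple of the page size, the loop does one extra fetch that returns an empty page, which is what finally breaks it.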

How am I meant to go about finding out stuff like this without an actual example to follow? It seems to me that other people on here just seem to know how to do it..!

Sorry if I'm not asking the question in the right way, or I'm just being a dim newbie about this, but I don't know where else to turn to figure this out!

Answer

I ran into the same problem, and the last comment was from a month ago, so here is what I found out about heavy dataset queries.

I guess I'm going to use the "query cursor" technique after reading these lines in the Google docs article (the one in Python mentioned above, by the way):


This article was written for SDK version 1.1.7. As of release 1.3.1, query cursors (Java | Python) have superseded the techniques described below and are now the recommended method for paging through large datasets.

In the Google docs about query cursors, the first line of the doc gives precisely why cursors are needed:


Query cursors allow an app to perform a query and retrieve a batch of results, then fetch additional results for the same query in a subsequent web request without the overhead of a query offset.

The documentation also provides a Java example of a servlet using the cursor technique, a tip on how to generate a safe cursor for the client, and finally the limitations of cursors.
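To make the cursor technique concrete, here is a minimal sketch against the low-level datastore API (the Contact kind and Name property are borrowed from the question; the page size of 100 is an assumption). The web-safe string is what you hand to the client between requests:

```java
import com.google.appengine.api.datastore.Cursor;
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.PreparedQuery;
import com.google.appengine.api.datastore.Query;
import com.google.appengine.api.datastore.QueryResultList;

DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
PreparedQuery pq = datastore.prepare(new Query("Contact").addSort("Name"));

// First page.
QueryResultList<Entity> page =
        pq.asQueryResultList(FetchOptions.Builder.withLimit(100));

// Remember where this page ended; toWebSafeString() is safe to send
// to the client (e.g. in a URL parameter).
String cursorString = page.getCursor().toWebSafeString();

// On a later request, resume exactly where we left off -- no offset cost.
Cursor cursor = Cursor.fromWebSafeString(cursorString);
QueryResultList<Entity> nextPage = pq.asQueryResultList(
        FetchOptions.Builder.withLimit(100).startCursor(cursor));
```

Unlike an offset, resuming from a cursor does not make the Datastore re-read and discard the earlier results.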

Hope this gives you a lead to resolving your problem.

A small reminder about range and offset, which have quite an impact on performance if forgotten (and I did forget ^^):


The starting offset has implications for performance: the Datastore must retrieve and then discard all results prior to the starting offset. For example, a query with a range of 5, 10 fetches ten results from the Datastore, then discards the first five and returns the remaining five to the application.


EDIT: While working with JDO, I kept looking for a way to let my previous code load more than 1000 results in a single query. So, if you are also using JDO, I found this old workaround:

Query query = pm.newQuery(...);
// Use a value below 1000 (the GAE limit).
query.getFetchPlan().setFetchSize(numberOfRecordByFetch);
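For completeness, when using the low-level datastore API instead of JDO, the chunk size from the original warning is set directly through FetchOptions.Builder; a minimal sketch (the Contact kind and the batch size of 500 are assumptions):

```java
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.PreparedQuery;
import com.google.appengine.api.datastore.Query;

DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
PreparedQuery pq = datastore.prepare(new Query("Contact"));

// chunkSize controls how many entities each underlying RPC retrieves
// while iterating; it does not cap the total number of results.
for (Entity contact : pq.asIterable(FetchOptions.Builder.withChunkSize(500))) {
    // process each contact
}
```

Setting the chunk size this way silences the logChunkSizeWarning and reduces the number of round trips while iterating over a large result set.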

