High frequency data refresh with Google App Engine


Problem description


I'm developing a service on GAE for Android clients and need to refresh the application data regularly, say once a minute.

In terms of architecture, this is how the application works:

  • The user launches the app and retrieves market data from the service running on GAE
  • The GAE service itself calls an external web service to retrieve market data, filters the results, and sends them to the user for display
  • The market prices should be updated once a minute

I know GAE offers cron jobs for automated scheduled tasks, but from what I understood it's not suitable for such high frequency tasks (or not even supported).

What are the best practices or tools I can use for this use case?

Also, is it recommended to update the data in the background regardless of whether the application is open, or just to update as soon as the user launches it?

[EDIT] I would also like to know whether pulling data every minute is the right approach, or whether the service should push instead.

Thank you in advance.

Solution

"it's not suitable for such high frequency tasks (or not even supported)" - this is not exactly correct.

Cron jobs can run at intervals as low as 1 minute; see "The schedule format" in the cron documentation:

The following are examples of schedules:

    every 12 hours
    every 5 minutes from 10:00 to 14:00
    every day 00:00
    every monday 09:00
    2nd,third mon,wed,thu of march 17:00
    1st monday of sep,oct,nov 17:00
    1 of jan,april,july,oct 00:00
    

If you don't need to run a recurring job at a specific time, but instead only need to run it at regular intervals, use the form:

    every N (hours|mins|minutes) ["from" (time) "to" (time)]
    

So for a 1-minute interval you can use:

    every 1 minutes
    

If you need intervals shorter than 1 minute, you can use the deferred library: tasks can be delayed relative to the moment they are enqueued, with the delay specified in seconds:

    deferred.defer(do_something_expensive, "Foobie bletch", 12, _countdown=30, _queue="myqueue")
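
One way to get sub-minute refreshes out of this, sketched here as an assumption rather than an established recipe, is to let the every-1-minutes cron handler enqueue several deferred calls with staggered `_countdown` values. The helper names below (`countdown_offsets`, `schedule_refreshes`) are hypothetical, and `defer` is passed in so the sketch runs without the App Engine SDK; on App Engine it would be `deferred.defer`.

```python
def countdown_offsets(interval_seconds, window_seconds=60):
    """Delays (in seconds) that split one cron window into equal steps."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return list(range(0, window_seconds, interval_seconds))


def schedule_refreshes(defer, refresh, interval_seconds, queue_name="myqueue"):
    """Enqueue one delayed call to `refresh` per offset.

    `defer` is expected to accept the same underscore-prefixed options as
    the deferred.defer call shown above (_countdown and _queue).
    """
    offsets = countdown_offsets(interval_seconds)
    for offset in offsets:
        defer(refresh, _countdown=offset, _queue=queue_name)
    return offsets
```

For a 15-second interval this enqueues tasks with countdowns of 0, 15, 30 and 45 seconds. The countdown is a lower bound on when a task runs, so the actual spacing may drift somewhat.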

The answer to the last questions really depends on how you want your app to behave: either have the data immediately available when the client app starts, or have the client app wait until the backend collects the data.

If you're just relaying the collected data to the client, either way is fine; there is no "common practice" (other than the higher cost of constant updates, of course). But if you also plan to offer results from processing historical data, you'll probably have to go with constant updates (or maybe only during market open hours).
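
The two behaviours can be expressed as small policy checks. This is a sketch under stated assumptions: the function names are hypothetical, the 09:30 to 16:00 weekday session is an invented placeholder (real hours depend on the exchange), and holidays and time zones are ignored.

```python
import datetime

MAX_AGE = datetime.timedelta(minutes=1)

# Hypothetical trading session; real hours depend on the exchange.
MARKET_OPEN = datetime.time(9, 30)
MARKET_CLOSE = datetime.time(16, 0)


def is_market_open(now):
    """True on weekdays between the assumed open and close times."""
    return now.weekday() < 5 and MARKET_OPEN <= now.time() < MARKET_CLOSE


def needs_refresh(updated_at, now):
    """On-demand policy: refresh only if the snapshot is missing or stale."""
    return updated_at is None or now - updated_at >= MAX_AGE


def should_run_background_refresh(now):
    """Constant-update policy, limited to market hours to reduce cost."""
    return is_market_open(now)
```

With the on-demand policy the first client after a quiet period waits for a fetch; with the background policy the data is always warm, but the backend pays for every refresh while the market is open.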

Update:

Task Queues are preferable to the deferred library. The same delayed execution is available through the optional countdown or eta arguments to taskqueue.add():

  • countdown -- Time in seconds into the future that this task should run or be leased. Defaults to zero. Do not specify this argument if you specified an eta.

  • eta -- A datetime.datetime that specifies the absolute earliest time at which the task should run. You cannot specify this argument if the countdown argument is specified. This argument can be time zone-aware or time zone-naive, or set to a time in the past. If the argument is set to None, the default value is now. For pull tasks, no worker can lease the task before the time indicated by the eta argument.
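
Since countdown and eta are mutually exclusive, it can help to build the delay arguments in one place. The helpers below are hypothetical and do not call the App Engine API; they only illustrate the rule quoted above and how a relative countdown relates to an absolute eta.

```python
import datetime


def delay_arguments(countdown=None, eta=None):
    """Build the delay-related keyword arguments for a taskqueue.add() call.

    Mirrors the rule quoted above: pass countdown or eta, never both.
    """
    if countdown is not None and eta is not None:
        raise ValueError("specify either countdown or eta, not both")
    if countdown is not None:
        return {"countdown": countdown}
    if eta is not None:
        return {"eta": eta}
    return {}  # neither given: the task is eligible to run immediately


def eta_from_countdown(countdown, now):
    """The absolute time that corresponds to a relative countdown."""
    return now + datetime.timedelta(seconds=countdown)


# On App Engine this might be used as follows (the URL is a made-up example):
# taskqueue.add(url="/tasks/refresh", **delay_arguments(countdown=30))
```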
