GAE - What is the fastest way to add tasks to a queue? Why does this appear to be so slow?


Question

I am using Google App Engine (Python) to process event messages in real time. In short, I have 100+ tasks that need to run quickly whenever a message comes in. I have tried a few approaches (the deferred library, threads), and I think the best solution is to use the task queue and add these tasks asynchronously to the queue I want. Here is an example of what I am doing:

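from flask import url_for
from google.appengine.api.taskqueue import Task

# Start one asynchronous add per id; each call returns an RPC object.
rpcs = []
for id in id_list:
    task = Task(url=url_for('main.endpoints_worker'), params={'id': id})
    rpcs.append(task.add_async(queue_name='event-message'))

# Block until every add has completed.
for rpc in rpcs:
    rpc.get_result()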

When I do this, most of my time is spent adding the tasks to the queue. Is there a way to speed this up? Is there a better approach?

To be honest, I get vastly different times on each run. Sometimes it takes around 100 ms (which would be fine), but other times it takes around 1 s.
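Because the timings vary this much between runs, a single measurement says little. The following is a minimal sketch, not part of the original question, of how repeated runs could be timed and summarized. The helper names `time_calls` and `summarize` are hypothetical, and `fn` stands for whatever enqueue code is being measured.

```python
import time


def time_calls(fn, repeats=5):
    """Call fn() `repeats` times and return each call's duration in milliseconds."""
    durations_ms = []
    for _ in range(repeats):
        start = time.time()
        fn()
        durations_ms.append((time.time() - start) * 1000.0)
    return durations_ms


def summarize(durations_ms):
    """Return (min, median, max) of a non-empty list of durations."""
    ordered = sorted(durations_ms)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2.0
    return ordered[0], median, ordered[-1]
```

Comparing the median and the maximum across several runs separates the typical cost of adding tasks from the occasional slow outlier.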

I would have thought that spreading out the work would be faster, but adding the tasks to the queue in bulk outperforms it. With the approach suggested below, this is what I now have:

tasks = [Task(url=url_for('main.endpoints_worker'), params={'id': id})
         for id in id_list]
rpc = Queue('event-message').add_async(tasks)
rpc.get_result()

UPDATE: I need to look at this issue again because of the 100-task limit when adding to a queue. I have greatly improved the throughput of my code by creating the tasks in batches (groups of 100), but I still don't understand why adding multiple groups of tasks to a queue slows down so quickly. A single queue.add_async call completes in under 40 ms without issue. When I make 2 or more queue.add_async calls, they become slower. Why is that, and how do I get around it?
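The batching described in this update can be sketched as follows. This is an illustrative helper, not the asker's actual code: the name `chunked` is hypothetical, and the default of 100 is taken from the limit mentioned above.

```python
def chunked(items, size=100):
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError('size must be at least 1')
    items = list(items)
    for start in range(0, len(items), size):
        # The final slice may be shorter than `size`.
        yield items[start:start + size]
```

For example, 250 ids would be split into groups of 100, 100 and 50, and each group could then be turned into a list of Task objects and added to the queue with one call.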

When I add batches of tasks synchronously, each call takes under 40 ms. Why do they take so much longer when I add them asynchronously?

Another update: I thought the issue might be related to contention, but I get the same results even when I add each batch of tasks to a different queue.

Recommended answer

You can save a ton of time by queuing your tasks in batches. Something like the following should work for you:

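from flask import url_for
from google.appengine.api.taskqueue import Queue, Task

# Build all tasks first, then add them to the queue with a single RPC.
tasks = [Task(url=url_for('main.endpoints_worker'), params={'id': id})
         for id in id_list]
rpc = Queue('event-message').add_async(tasks)
rpc.wait()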

Note that you can't submit tasks in batches with the deferred library.
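When there are more tasks than fit in one batch, one possible pattern is to start an asynchronous add for every batch before waiting on any of them. The sketch below only illustrates that pattern under stated assumptions: `add_batches_async` and its `start_add` parameter are hypothetical names, and `start_add` is assumed to return an object with a `get_result()` method, as the RPC returned by `Queue.add_async` does.

```python
def add_batches_async(batches, start_add):
    """Start one async add per batch, then wait for all of them.

    `start_add` is a hypothetical callable: it takes a list of tasks, begins
    an asynchronous add, and returns an object with a get_result() method.
    """
    # Issue every call before blocking on any of them, so they can overlap.
    rpcs = [start_add(batch) for batch in batches]
    # get_result() blocks until that particular call has finished.
    return [rpc.get_result() for rpc in rpcs]
```

On App Engine, `start_add` could be `lambda batch: Queue('event-message').add_async(batch)`. This does not explain the slowdown reported in the question when two or more adds are in flight; it only makes the pattern explicit so that it can be timed against adding the batches one at a time.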
