Why does celery add thousands of queues to rabbitmq that seem to persist long after the tasks complete?
Question
I am using celery with a rabbitmq backend. It is producing thousands of queues with 0 or 1 items in them in rabbitmq like this:
$ sudo rabbitmqctl list_queues
Listing queues ...
c2e9b4beefc7468ea7c9005009a57e1d 1
1162a89dd72840b19fbe9151c63a4eaa 0
07638a97896744a190f8131c3ba063de 0
b34f8d6d7402408c92c77ff93cdd7cf8 1
f388839917ff4afa9338ef81c28aad75 0
8b898d0c7c7e4be4aa8007b38ccc00ea 1
3fb4be51aaaa4ac097af535301084b01 1
This seems inefficient, and furthermore I have observed that these queues persist long after processing is finished.
I have found the task that appears to be doing this:
from celery import group

@celery.task(ignore_result=True)
def write_pages(page_generator):
    # Fan out one render_page subtask per page and dispatch them as a group
    g = group(render_page.s(page) for page in page_generator)
    res = g.apply_async()
    for rendered_page in res:
        print rendered_page  # TODO: print to file
It seems that because these tasks are being called in a group, they are being thrown into the queue but never being released. However, I am clearly consuming the results (as I can see them being printed when I iterate through res). So, I do not understand why those tasks are persisting in the queue.
Additionally, I am wondering if the large number of queues being created is some indication that I am doing something wrong.
Thanks for any help with this!
Answer
Celery with the AMQP backend will store task tombstones (results) in an AMQP queue named with the task ID that produced the result. These queues will persist even after the results are drained.
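One mitigation worth knowing about (a sketch, not part of the answer's suggestions below): with the AMQP result backend, Celery declares each per-task result queue with an expiry taken from its result-expiry setting, so RabbitMQ eventually deletes idle result queues on its own. A minimal settings-module sketch, assuming a Celery 3.x-era setup loaded via `config_from_object` (module name and values are illustrative):

```python
# celeryconfig.py -- illustrative settings module.
# With the AMQP result backend, each per-task result queue is declared
# with an expiry derived from this setting, so RabbitMQ deletes idle
# result queues itself after the timeout.

# Let unread results (and their queues) expire after one hour instead
# of the much longer default:
CELERY_TASK_RESULT_EXPIRES = 60 * 60
```

This shortens how long the stray queues linger, but it does not stop them from being created in the first place.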
Some recommendations:
- Apply ignore_result=True to every task you can. Don't depend on results from other tasks.
- Switch to a different backend (perhaps Redis -- it's more efficient anyway): http://docs.celeryproject.org/en/latest/userguide/tasks.html
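The backend switch in the second suggestion is essentially a one-line configuration change. A minimal sketch, assuming RabbitMQ stays on as the broker and a Redis server is running locally on the default port (the URLs and module name are illustrative):

```python
# celeryconfig.py -- illustrative settings: keep RabbitMQ as the broker,
# but store task results in Redis, so Celery no longer creates a
# per-task result queue in RabbitMQ for every group member.
BROKER_URL = 'amqp://guest@localhost//'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/0'
```

With a key-value backend like Redis, each result is a single expiring key rather than a dedicated AMQP queue, which is why the queue count stops growing.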