Understanding Celery task prefetching

Question

I just found out about the configuration option CELERYD_PREFETCH_MULTIPLIER (docs). The default is 4, but (I believe) I want prefetching off or as low as possible. I set it to 1 now, which is close enough to what I'm looking for, but there are still some things I don't understand:


  1. Why is this prefetching a good idea? I don't really see a reason for it, unless there's a lot of latency between the message queue and the workers (in my case, they are currently running on the same host and at worst might eventually run on different hosts in the same data center). The documentation only mentions the disadvantages, but fails to explain what the advantages are.

  2. Many people seem to set this to 0, expecting to be able to turn off prefetching that way (a reasonable assumption in my opinion). However, 0 means unlimited prefetching. Why would anyone ever want unlimited prefetching? Doesn't that entirely eliminate the concurrency/asynchronicity you introduced a task queue for in the first place?

  3. Why can prefetching not be turned off? Turning it off might not be a good idea for performance in most cases, but is there a technical reason why it isn't possible? Or is it just not implemented?

  4. Sometimes, this option is connected to CELERY_ACKS_LATE. For example, Roger Hu writes «[…] often what [users] really want is to have a worker only reserve as many tasks as there are child processes. But this is not possible without enabling late acknowledgements […]» I don't understand how these two options are connected and why one is not possible without the other. Another mention of the connection can be found here. Can someone explain why the two options are connected?


Recommended answer


  1. Prefetching can improve performance. Workers don't need to wait for the next message from the broker before they have something to process. Fetching many messages in one round trip to the broker is cheaper than fetching them one at a time, because getting a message from a broker (even a local one) is expensive compared to local memory access. Workers can also acknowledge messages in batches.

  2. Prefetching set to zero means "no specific limit" rather than unlimited.

  3. Setting prefetching to 1 is documented to be equivalent to turning it off, but this may not always be the case (see https://stackoverflow.com/a/33357180/71522).

  4. Prefetching allows messages to be acknowledged in batches. CELERY_ACKS_LATE=True prevents messages from being acknowledged as soon as they reach a worker; they are acknowledged only after the task has run.
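To make the numbers above concrete, here is a minimal sketch in plain Python. It assumes the rule stated in the Celery documentation, that the prefetch limit is the multiplier times the number of worker processes. The helper `prefetch_limit` is hypothetical, written only for illustration; it is not part of Celery. The two settings at the bottom use the old-style uppercase names from the question and would normally live in a config module.

```python
from typing import Optional


def prefetch_limit(concurrency: int, multiplier: int) -> Optional[int]:
    """Hypothetical helper: how many messages one worker may reserve.

    Assumes limit = concurrency * multiplier. A multiplier of 0 is
    modelled as None, meaning no specific limit is applied.
    """
    if multiplier == 0:
        return None
    return concurrency * multiplier


# Default multiplier (4) with 4 worker processes.
default_limit = prefetch_limit(concurrency=4, multiplier=4)

# Multiplier lowered to 1: one reserved message per process.
minimal_limit = prefetch_limit(concurrency=4, multiplier=1)

# Multiplier 0: no limit is set at all.
no_limit = prefetch_limit(concurrency=4, multiplier=0)

# Settings discussed in the question (old-style names). Newer Celery
# versions spell them worker_prefetch_multiplier and task_acks_late.
CELERYD_PREFETCH_MULTIPLIER = 1
CELERY_ACKS_LATE = True
```

One common explanation for why the two settings are paired: the limit counts unacknowledged messages. With early acknowledgement, a task that is already running no longer counts against the limit, so the worker can reserve another message on top of it. With late acknowledgement, running tasks stay unacknowledged, so a multiplier of 1 really does cap each process at one task.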
