Celery: difference between concurrency, workers and autoscaling


Question

In my /etc/defaults/celeryd config file, I've set:

CELERYD_NODES="agent1 agent2 agent3 agent4 agent5 agent6 agent7 agent8"
CELERYD_OPTS="--autoscale=10,3 --concurrency=5"

I understand that the daemon spawns 8 celery workers, but I'm not entirely sure what autoscale and concurrency do together. I thought that concurrency was a way to specify the max number of threads that a worker can use, and that autoscale was a way for the worker to scale child workers up and down if necessary.

The tasks have a largish payload (some 20-50kB) and there are like 2-3 million such tasks, but each task runs in less than a second. I'm seeing memory usage spike up because the broker distributes the tasks to every worker, thus replicating the payload multiple times.

I think the issue is in the config and that the combination of workers + concurrency + autoscaling is excessive and I would like to get a better understanding of what these three options do.

Recommended answer

Let's distinguish between workers and worker processes. You spawn a Celery worker, which then spawns a number of processes (depending on options like --concurrency and --autoscale; the default is to spawn as many processes as there are cores on the machine). There is no point in running more than one worker on a particular machine unless you want to do routing.
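
To make the worker versus worker-process distinction concrete, here is a rough back-of-the-envelope sketch in Python. It assumes that --autoscale=10,3 means "at most 10, at least 3 processes per worker" and that the autoscale bounds, rather than --concurrency=5, govern the pool size when both are given. The helper name process_bounds is made up for illustration.

```python
# Node names taken from CELERYD_NODES in the question: one worker per name.
nodes = "agent1 agent2 agent3 agent4 agent5 agent6 agent7 agent8".split()

# --autoscale=10,3 read as (max, min) processes per worker (assumption stated above).
autoscale_max, autoscale_min = 10, 3


def process_bounds(worker_count, min_procs, max_procs):
    """Return (fewest, most) worker processes across all workers on the machine."""
    return worker_count * min_procs, worker_count * max_procs


low, high = process_bounds(len(nodes), autoscale_min, autoscale_max)
print(f"{len(nodes)} workers -> between {low} and {high} worker processes")
```

Under these assumptions the machine runs 8 workers but somewhere between 24 and 80 processes, which is far more than the number of cores on a typical machine.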

I would suggest running only 1 worker per machine with the default number of processes. This will reduce memory usage by eliminating the duplication of data between workers.

If you still have memory issues then save the data to a store and pass only an id to the workers.
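
A minimal sketch of the "store the data, pass only an id" idea follows. It does not import Celery, so it runs on its own. PAYLOAD_STORE, save_payload and process_payload are hypothetical names; the in-memory dict stands in for whatever database, cache or object store you actually use, and process_payload stands in for the body of a task that would receive the id as its only argument.

```python
import uuid

# Hypothetical store: a plain dict standing in for a database, cache or object store.
PAYLOAD_STORE = {}


def save_payload(data):
    """Store the large payload once and return a short key for it."""
    payload_id = uuid.uuid4().hex
    PAYLOAD_STORE[payload_id] = data
    return payload_id


def process_payload(payload_id):
    """Stand-in for the task body: look the payload up by id, then do the work."""
    data = PAYLOAD_STORE[payload_id]
    return len(data)  # placeholder for the real processing


# Producer side: only the 32-character id would travel through the broker.
payload_id = save_payload(b"x" * 30_000)

# Worker side: the payload is fetched only when the task actually runs.
result = process_payload(payload_id)
print(payload_id, result)
```

With this shape, each queued message carries a few dozen bytes instead of 20-50kB, so messages held by workers no longer multiply the payload in memory.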
