Will Airflow Celery workers be blocked if the number of sensors is larger than the concurrency?


Question


Let's say I set the Celery concurrency to n, but I have m (m > n) ExternalTaskSensor tasks in a DAG, all of which check another DAG named do_sth. These ExternalTaskSensor tasks will occupy every Celery worker slot, so no slot is left to run the actual work and nothing makes progress.

I can't set the concurrency too high (for example 2*m) either, because the DAG do_sth may then start too many processes, which would lead to running out of memory.

I am confused about what number I should set the Celery concurrency to.

Solution

The Gotchas section of "ETL Best Practices with Airflow" addresses this general problem. One of its suggestions is to set up a pool for your sensor tasks so that your other tasks don't get starved. For your situation, decide how many sensor tasks you want running at one time (fewer than your concurrency level) and create a pool with that number as its slot limit. Once the pool exists, pass the pool argument to each of your sensor operators. For more on pools, see the Concepts page of the Airflow documentation. Here is an example of passing a pool argument to an operator:

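from datetime import timedelta
from airflow.operators.bash import BashOperator  # Airflow 2.x; in 1.x: airflow.operators.bash_operator

# aggregate_db_message_job_cmd and dag are assumed to be defined elsewhere in the DAG file.
aggregate_db_message_job = BashOperator(
    task_id='aggregate_db_message_job',
    execution_timeout=timedelta(hours=3),
    pool='ep_data_pipeline_db_msg_agg',  # name of a pool that already exists in Airflow
    bash_command=aggregate_db_message_job_cmd,
    dag=dag)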
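
To see why a pool helps, here is a toy model in plain Python. It is not Airflow code: the function name free_slots_for_work and the numbers are made up for illustration. It assumes the worst case, where every waiting sensor holds one worker slot until its condition is met and sensors are picked up before other tasks.

```python
def free_slots_for_work(worker_slots, sensors_waiting, sensor_pool_size=None):
    """Return how many worker slots remain for non-sensor tasks.

    Simplified model (an assumption, not Airflow's scheduler): each waiting
    sensor occupies one worker slot, and sensors are started before any
    other task.
    """
    if sensor_pool_size is None:
        # Without a pool, sensors are limited only by the worker slots.
        sensor_limit = worker_slots
    else:
        # With a pool, at most sensor_pool_size sensors run at once.
        sensor_limit = min(sensor_pool_size, worker_slots)
    running_sensors = min(sensors_waiting, sensor_limit)
    return worker_slots - running_sensors


n, m = 4, 6  # hypothetical values: 4 worker slots, 6 ExternalTaskSensor tasks
print(free_slots_for_work(n, m))                      # no pool: prints 0
print(free_slots_for_work(n, m, sensor_pool_size=2))  # pool of 2: prints 2
```

In this model, a sensor pool smaller than n always leaves at least n minus the pool size slots for the tasks of do_sth, without raising the Celery concurrency itself.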
