How does the >> operator define task dependencies in Airflow?

Problem description

I was going through the Apache Airflow tutorial at https://github.com/hgrif/airflow-tutorial and came across this section on defining task dependencies.

with DAG('airflow_tutorial_v01',
         default_args=default_args,
         schedule_interval='0 * * * *',
         ) as dag:

    print_hello = BashOperator(task_id='print_hello',
                               bash_command='echo "hello"')
    sleep = BashOperator(task_id='sleep',
                         bash_command='sleep 5')
    print_world = PythonOperator(task_id='print_world',
                                 python_callable=print_world)


print_hello >> sleep >> print_world

The line that confuses me is

print_hello >> sleep >> print_world

What does >> mean in Python? I know it as a bitwise operator, but I can't relate that to the code here.

Solution

Airflow represents workflows as directed acyclic graphs (DAGs). A workflow is any number of tasks that have to be executed, either in parallel or sequentially. In Airflow, ">>" sets one task downstream of another: in print_hello >> sleep >> print_world, sleep runs after print_hello, and print_world runs after sleep.

In the incubator-airflow project repo, models.py in the airflow directory defines much of the behavior of Airflow's high-level abstractions. You can dig into the other classes there if you'd like, but the one that answers your question is the BaseOperator class. All operators in Airflow inherit from BaseOperator. Its __rshift__ method overloads Python's right-shift operator (>>), which is a bitwise shift on integers, so that between operators it sets a task or a DAG downstream of another.

See implementation here.
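
To make the mechanism concrete, here is a minimal sketch of operator overloading with __rshift__. It does not use Airflow at all: the Task class, its upstream/downstream lists, and this set_downstream method are hypothetical stand-ins written for illustration, not Airflow's actual implementation. The key idea is that Python evaluates a >> b by calling a.__rshift__(b), and returning the right-hand task is what allows a chain such as a >> b >> c to work.

```python
class Task:
    """Toy stand-in for an operator (hypothetical, not Airflow's real class)."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []  # tasks that run after this one
        self.upstream = []    # tasks that must finish before this one

    def set_downstream(self, other):
        # Record the dependency in both directions.
        self.downstream.append(other)
        other.upstream.append(self)

    def __rshift__(self, other):
        # Python calls this method for the expression `self >> other`.
        self.set_downstream(other)
        # Returning the right-hand task lets `a >> b >> c` chain,
        # because the expression is evaluated as `(a >> b) >> c`.
        return other


print_hello = Task("print_hello")
sleep = Task("sleep")
print_world = Task("print_world")

print_hello >> sleep >> print_world

print([t.task_id for t in print_hello.downstream])  # ['sleep']
print([t.task_id for t in sleep.downstream])        # ['print_world']
```

In this sketch, the chained line is equivalent to calling print_hello.set_downstream(sleep) followed by sleep.set_downstream(print_world).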
