Is it possible for Airflow scheduler to first finish the previous day's cycle before starting the next?
Question
Right now, nodes in my DAG proceed to the next day's tasks before the rest of the nodes of that DAG finish. Is there a way to make them wait for the rest of the DAG to finish before moving on to the next day's DAG cycle?
(I do have depends_on_past set to true, but that does not work in this case.)
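For context, depends_on_past is a per-task flag, usually set through default_args. It only serializes each task with its own previous instance, which is why it does not hold back the whole next DAG run. A minimal sketch of where the flag lives (the values here are illustrative, not from the question):

```python
from datetime import datetime, timedelta

# Hypothetical default_args dict showing the depends_on_past flag.
# With this set, TASK_01 of run N waits only for TASK_01 of run N-1;
# other tasks of run N can still start before run N-1 fully finishes.
default_args = {
    'owner': 'airflow',
    'start_date': datetime(2019, 1, 1),
    'depends_on_past': True,  # task waits for its own previous instance only
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}
```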
My DAG looks like this:
O
|
V
O -> O -> O -> O -> O
[Also, tree view pic of the DAG]
Solution
Might be a bit late with this answer, but I ran into the same issue, and the way I resolved it was to add two extra tasks to each DAG: "Previous" at the start and "Complete" at the end. The Previous task is an ExternalTaskSensor that monitors the previous run. Complete is just a dummy operator. Let's say the DAG runs every 30 minutes, so it would look like this:
from datetime import timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.dummy_operator import DummyOperator
from airflow.sensors.external_task_sensor import ExternalTaskSensor

# default_args defined elsewhere
dag = DAG(dag_id='TEST_DAG', default_args=default_args,
          schedule_interval=timedelta(minutes=30))

# Waits for the 'All_Tasks_Completed' task of this same DAG's
# previous run (30 minutes earlier) to succeed.
PREVIOUS = ExternalTaskSensor(
    task_id='Previous_Run',
    external_dag_id='TEST_DAG',
    external_task_id='All_Tasks_Completed',
    allowed_states=['success'],
    execution_delta=timedelta(minutes=30),
    dag=dag  # note: dag=dag (the instance), not dag=DAG (the class)
)

T1 = BashOperator(
    task_id='TASK_01',
    bash_command='echo "Hello World from Task 1"',
    dag=dag
)

# Dummy anchor task that the next run's sensor waits on.
COMPLETE = DummyOperator(
    task_id='All_Tasks_Completed',
    dag=dag
)

PREVIOUS >> T1 >> COMPLETE
So even though the next DAG run comes into the queue, its tasks will not run until PREVIOUS completes.
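One detail worth checking when adapting this: the sensor looks for the external task at execution_date minus execution_delta, so execution_delta must match the DAG's schedule_interval (30 minutes above) or the sensor will poll a run that never existed. A small sketch of that arithmetic, with an illustrative date:

```python
from datetime import datetime, timedelta

# The ExternalTaskSensor above checks for All_Tasks_Completed at
# execution_date - execution_delta, so execution_delta must equal
# the DAG's schedule_interval for it to target the previous run.
schedule_interval = timedelta(minutes=30)
execution_date = datetime(2019, 1, 1, 10, 0)  # example run

previous_execution_date = execution_date - schedule_interval
print(previous_execution_date)  # 2019-01-01 09:30:00
```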