How to define a DAG that schedules a monthly job together with a daily job?


Question

I have to

  • update a table Foo monthly,
  • update another table Bar daily,
  • and join these two tables daily, inserting the result into a third table Bazz.

Is it possible to configure the tasks so that

  • Foo is updated on a certain day of the month (say the 5th),
  • while Bar is updated daily,
  • and both live in the same DAG?

Recommended answer

This behaviour can be achieved within a single DAG using either of the following alternatives:

  • ShortCircuitOperator
  • AirflowSkipException (better in my opinion)

Basically, your DAG would still run every day (schedule_interval='@daily'), but

  • on most days, only your Bar task would run, while Foo would get skipped (or short-circuited);
  • on one particular day (such as the 5th of each month), both would run.

You can, of course, also model these as separate DAGs and chain them together (see https://stackoverflow.com/q/51325525/3679900) rather than as individual tasks within a single DAG. That choice might be better as long as the number of DAGs you are linking together is small.
