How to stop a DAG from backfilling? catchup_by_default=False and catchup=False do not seem to work, and the Airflow scheduler still backfills


Problem description

The setting catchup_by_default=False in airflow.cfg does not seem to work. Adding catchup=False to the DAG does not work either.

Here's how to reproduce the issue. I always start from a clean slate by running airflow resetdb. As soon as I unpause the DAG, the tasks start to backfill.

Here's the setup for the DAG. I'm just using the tutorial example.

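from datetime import datetime, timedelta

from airflow import DAG

# Default arguments applied to every task in the DAG.
default_args = {
    "owner": "airflow",
    "depends_on_past": False,
    "start_date": datetime(2018, 9, 16),
    "email": ["airflow@airflow.com"],
    "email_on_failure": False,
    "email_on_retry": False,
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

# Daily schedule expressed as a timedelta, with catchup disabled.
dag = DAG("tutorial", default_args=default_args, schedule_interval=timedelta(1), catchup=False)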


Recommended answer

As @dlamblin mentioned, and as the docs also state, Airflow creates a single DagRun for the most recent valid interval. catchup=False instructs the scheduler to create a DAG Run only for the most recent instance in the DAG interval series.
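To make that rule concrete, here is a simplified model in plain Python. It is only a sketch and not Airflow's scheduler code: the function names completed_intervals and runs_to_create are hypothetical, and the model assumes a fixed-length interval and that a run is created only once its interval has fully elapsed.

```python
from datetime import datetime, timedelta


def completed_intervals(start_date, interval, now):
    """Return the start of every interval that has fully elapsed by `now`.

    Hypothetical helper for illustration; assumes a positive, fixed interval.
    """
    starts = []
    current = start_date
    while current + interval <= now:
        starts.append(current)
        current += interval
    return starts


def runs_to_create(start_date, interval, now, catchup):
    """Model which interval starts would get a run (simplified assumption)."""
    intervals = completed_intervals(start_date, interval, now)
    if catchup:
        # Backfill: one run per elapsed interval since start_date.
        return intervals
    # No catchup: at most one run, for the most recent elapsed interval.
    return intervals[-1:]


start = datetime(2018, 9, 16)
now = datetime(2018, 9, 20, 12, 0)  # example "current time", chosen for illustration

with_catchup = runs_to_create(start, timedelta(days=1), now, catchup=True)
without_catchup = runs_to_create(start, timedelta(days=1), now, catchup=False)

print(len(with_catchup))   # number of runs when backfilling
print(without_catchup)     # the single most recent interval
```

So even with catchup=False, seeing one run start right after unpausing is expected in this model; seeing a run for every past interval is not.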

However, there was a bug when a timedelta was used for schedule_interval instead of a cron expression or cron preset. It has been fixed on Airflow master in https://github.com/apache/airflow/pull/8776, and we will release Airflow 1.10.11 with this fix.
