Airflow DAG Running Every Second Rather Than Every Minute

Problem description

I'm trying to schedule my DAG to run every minute, but it seems to be running every second instead. Based on everything I've read, I should just need to include schedule_interval='*/1 * * * *' (every 1 minute) in my DAG, but it's not working. Here is a simple example I set up to test it:

from airflow import DAG
from airflow.operators import SimpleHttpOperator, HttpSensor, EmailOperator, S3KeySensor
from datetime import datetime, timedelta
from airflow.operators.bash_operator import BashOperator

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2018, 6, 4),
    'schedule_interval': '*/1 * * * *', #..every 1 minute
    'email': ['airflow@airflow.com'],
    'email_on_failure': True,
    'email_on_retry': False,
    'retries': 2,
    'retry_delay': timedelta(minutes=1)
}

dag = DAG(
    dag_id='airflow_slack_example',
    start_date=datetime(2018, 6, 4),
    max_active_runs=3,
    schedule_interval='*/1 * * * *', #..every 1 minute
    default_args=default_args,
)

test= BashOperator(
    task_id='test',
    bash_command="echo hey >> /home/ec2-user/schedule_test.txt",
    retries=1,
    dag=dag)

Update:

After talking with @Taylor Edmiston about his solution, we realized that the reason I needed to add catchup=False is that I installed Airflow using pip, which gave me an outdated version of Airflow. Apparently, if you're using Airflow from the master branch of its repository, you don't need to include catchup=False for the DAG to run every minute as I intended. So although the accepted answer fixed my issue, it doesn't address the underlying problem that @Taylor Edmiston discovered.

Recommended answer

Try adding catchup=False to the DAG() call. It might be that your DAG is trying to backfill every interval since the start_date that you have declared.
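
To get a feel for why a backfill could look like the DAG firing every second, here is a minimal sketch that counts the runs a scheduler would owe under a simplified model: one run per schedule interval that has fully elapsed since start_date. The helper `pending_runs` and the "current" time are hypothetical and only for illustration; this is not Airflow's actual scheduling code.

```python
from datetime import datetime, timedelta


def pending_runs(start_date, now, interval, catchup=True):
    """Estimate how many DAG runs a scheduler would create right now.

    Simplified model (an assumption, not Airflow's actual code): one run
    per schedule interval that has fully elapsed since start_date.
    """
    if now <= start_date:
        return 0
    elapsed_intervals = (now - start_date) // interval
    if catchup:
        # Catch-up on: every missed interval gets its own run.
        return elapsed_intervals
    # Catch-up off: only the most recent interval is scheduled.
    return min(elapsed_intervals, 1)


start = datetime(2018, 6, 4)         # start_date from the question
now = datetime(2018, 6, 5)           # hypothetical "current" time, one day later
every_minute = timedelta(minutes=1)  # equivalent of '*/1 * * * *'

backlog = pending_runs(start, now, every_minute, catchup=True)
latest_only = pending_runs(start, now, every_minute, catchup=False)
print(backlog, latest_only)
```

Under this model, a single day between start_date and the moment the DAG is switched on already leaves well over a thousand runs to catch up on. With max_active_runs=3, as in the question, the scheduler would presumably work through that backlog a few runs at a time, back to back, which would look like the task running far more often than once a minute. In the question's DAG, the suggested fix corresponds to passing catchup=False next to schedule_interval in the DAG(...) call.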
