How to trigger a daily DAG run at midnight local time instead of midnight UTC


Question

I'm in the UTC+4 timezone, so by the time Airflow triggers the nightly ETLs, it's already 4:00 AM here. How can I tell Airflow to trigger the run for day ds already on day ds-1 at 20:00 UTC, but with ds=ds?

Per the docs, it's highly recommended to keep all servers on UTC, which is why I'm looking for an application-level solution.

A hacky solution is to define the DAG to run every day at 20:00 UTC, so on the "previous" day, and then use tomorrow_ds instead of ds in the job. But that still looks weird in the Airflow UI, because it will show the UTC execution time.
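The arithmetic behind this workaround can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names are hypothetical, and a fixed UTC+4 offset is assumed, as in the question. It shows why a 20:00 UTC timestamp already falls on the next calendar day locally, which is what tomorrow_ds compensates for.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+4 offset, matching the timezone in the question.
LOCAL_OFFSET = timedelta(hours=4)


def utc_cron_for_local_midnight(offset):
    """Hypothetical helper: cron expression (in UTC) matching local midnight."""
    # Local midnight minus the offset, wrapped into a 24-hour day.
    minutes = int(-offset.total_seconds() // 60) % (24 * 60)
    return "{} {} * * *".format(minutes % 60, minutes // 60)


def local_date_at(utc_time, offset):
    """Hypothetical helper: local calendar date for a naive UTC timestamp."""
    aware = utc_time.replace(tzinfo=timezone.utc)
    return aware.astimezone(timezone(offset)).date()


print(utc_cron_for_local_midnight(LOCAL_OFFSET))
print(local_date_at(datetime(2017, 11, 8, 20, 0), LOCAL_OFFSET))
```

For UTC+4 the first call gives the cron string for 20:00 UTC, and the second shows that 20:00 UTC on 2017-11-08 is already 2017-11-09 in local time.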

Recommended answer

The schedule interval can also be a cron expression, which means you can easily run the DAG at 20:00 UTC. Coupled with user_defined_filters, that means you can, with a bit of trickery, get the behaviour you want:

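from datetime import datetime

import pytz
from airflow.models import DAG
from airflow.operators.bash_operator import BashOperator

# Local timezone (UTC+4) used by the template filter below.
tz = pytz.timezone('Asia/Dubai')


def localize_utc_tz(d):
    # Convert a datetime holding a UTC time into the local timezone.
    # Note: pytz's fromutc() expects d to be naive or to already carry tz.
    return tz.fromutc(d)


default_args = {
    'start_date': datetime(2017, 11, 8),
}

dag = DAG(
    'plus_4_utc',
    default_args=default_args,
    # 20:00 UTC is midnight in UTC+4.
    schedule_interval='0 20 * * *',
    # Make the converter available in templates as the 'localtz' filter.
    user_defined_filters={
        'localtz': localize_utc_tz,
    },
)

task = BashOperator(
    task_id='task_for_testing_file_log_handler',
    dag=dag,
    bash_command='echo UTC {{ ts }}, Local {{ execution_date | localtz }} next {{ next_execution_date | localtz }}',
)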

Output:

UTC 2017-11-08T20:00:00, Local 2017-11-09 00:00:00+04:00 next 2017-11-10 00:00:00+04:00


You'll have to be careful about the types of the variables you use. For instance, ds and ts are strings, not datetime objects, which means the filter won't work on them.
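If a string variable such as ts does need converting, one option is a filter that parses the string first. The following is a minimal sketch, not part of the original answer: the function names are hypothetical, the timestamp format is assumed to match the ts value shown in the output above, and a stdlib fixed UTC+4 offset stands in for pytz's 'Asia/Dubai' (which has no daylight saving time).

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+4 offset stands in for pytz's 'Asia/Dubai'.
LOCAL_TZ = timezone(timedelta(hours=4))


def localize_utc_string(value, fmt="%Y-%m-%dT%H:%M:%S"):
    """Hypothetical filter: parse a UTC timestamp string into a local datetime."""
    parsed = datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
    return parsed.astimezone(LOCAL_TZ)


def local_ds(value):
    """Hypothetical filter: local calendar date (YYYY-MM-DD) for a UTC ts string."""
    return localize_utc_string(value).strftime("%Y-%m-%d")


print(localize_utc_string("2017-11-08T20:00:00"))  # 2017-11-09 00:00:00+04:00
print(local_ds("2017-11-08T20:00:00"))             # 2017-11-09
```

Functions like these could be registered through user_defined_filters in the same way as localtz, for example to apply to ts in a template.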
