How to automatically reschedule Airflow tasks


Question

I am running an hourly process that picks up data from one location ("origin") and moves it to another ("destination"). For the most part, the data arrives at the origin at a specific time and everything works fine, but there can be delays, and when that happens the task in Airflow fails and needs to be re-run manually. One way to solve this is to allow more time for the data to arrive, but I would prefer to do that only when there is in fact a delay. I also wouldn't want a sensor that waits on the data for a long time, since that can cause deadlocks (preferably, an hourly task should not run for longer than 1 hour). Does Airflow allow a task to be rescheduled on a given condition (failure, or no data present), so that we don't have to re-run failed tasks manually?

Thanks!

Recommended answer

Check out the following parameters of BaseOperator (the parent class of all operators):

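  • retry_delay (timedelta) – delay between retries
  • retry_exponential_backoff (bool) – allows progressively longer waits between retries by applying an exponential backoff algorithm to the retry delay (the delay is converted to seconds)
  • max_retry_delay (timedelta) – maximum delay interval between retries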

A good mix of these three should give you what you want.
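As a minimal sketch of one such mix, the three parameters can be collected in a `default_args` dictionary. The concrete values below (3 minute initial delay, 15 minute cap, 4 retries) are assumptions chosen so that all waits fit inside the hourly window; `retries` is a further BaseOperator parameter not listed above, and it generally has to be set to a non-zero value for any retry to happen. The helper `approximate_delays` is hypothetical and only illustrates plain doubling with a cap; Airflow's own backoff computation differs in detail, so the real waits will not match exactly.

```python
from datetime import timedelta

# Illustrative values only; tune them to how late the data usually arrives.
default_args = {
    "retries": 4,                              # assumed value; BaseOperator parameter
    "retry_delay": timedelta(minutes=3),       # wait before the first retry
    "retry_exponential_backoff": True,         # lengthen the wait on each retry
    "max_retry_delay": timedelta(minutes=15),  # upper bound on any single wait
}


def approximate_delays(retries, retry_delay, max_retry_delay):
    """Hypothetical helper: plain doubling, capped at max_retry_delay.

    This is a simplification for planning purposes, not Airflow's
    actual backoff formula.
    """
    delays = []
    for attempt in range(retries):
        delay = retry_delay * (2 ** attempt)        # 3, 6, 12, 24 minutes...
        delays.append(min(delay, max_retry_delay))  # ...with the last one capped
    return delays


schedule = approximate_delays(
    default_args["retries"],
    default_args["retry_delay"],
    default_args["max_retry_delay"],
)
total_wait = sum(schedule, timedelta())  # rough total time spent waiting
```

The dictionary would then be passed as `default_args=default_args` when constructing the DAG, or the same keyword arguments could be given to an individual operator. Under the simplified doubling above, the total wait stays below one hour, which matches the preference stated in the question.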

https://incubator-airflow.readthedocs.io/en/latest/code.html
