Yet another "This DAG isn't available in the webserver DagBag object"


Problem description


This seems to be a fairly common issue. I have a DAG where, not only can I trigger it manually with airflow trigger_dag, but it's even executing according to its schedule, but it refuses to show up in the UI.

I've already restarted the webserver and scheduler multiple times, pressed "refresh" like a billion times, and run it through airflow backfill. Anyone have any other ideas? Any other pertinent information I can provide?

I'm on Airflow 1.9.0.

Solution

I have been debugging this exact problem for the last few hours. It seems to be due to a silent error in the DAG. Leaving my notes here for the next poor soul.

So in my case, this error was due to the following blocks of code in my DAG:

This fails:

import json

def read_lakes_id_file_simple():
    LAKES_ID_FILE = "/home/airflow/gcs/data/lakes_to_monitor.json"
    with open(LAKES_ID_FILE) as json_file:
        data = json.load(json_file)
    return data

This passes:

import json

def read_lakes_id_file_simple():
    try:
        LAKES_ID_FILE = "/home/airflow/gcs/data/lakes_to_monitor.json"
        with open(LAKES_ID_FILE) as json_file:
            data = json.load(json_file)
        return data
    except Exception as e:
        return 'LOTS OF LAKES'

So I'm guessing the first fails somehow when read/checked by the scheduler, perhaps because it can't find the file, or whatnot, while the second succeeds because it's run in the right path by the worker. (Or it could be something else.) What seems clear is that there are two different runs with different behavior when loading/running the DAG, and one fails silently while the other succeeds.
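If that's the cause, one way to sidestep it entirely (a minimal sketch, assuming a standard PythonOperator setup; the function name and commented-out task wiring here are illustrative, not from the original answer) is to defer the file read to task-execution time, so the scheduler never executes open() while merely parsing the DAG file:

```python
import json

# Sketch: keep parse-time code trivial and do the file read inside the
# callable. While parsing, the scheduler only sees the function reference;
# only the worker (which has the file on its path) actually runs it.
def read_lakes_id_file(path="/home/airflow/gcs/data/lakes_to_monitor.json"):
    with open(path) as json_file:
        return json.load(json_file)

# In the DAG file, pass the function itself rather than calling it:
# read_task = PythonOperator(
#     task_id="read_lakes_id_file",
#     python_callable=read_lakes_id_file,
#     dag=dag,
# )
```

This way a missing file fails loudly in the task log, instead of silently knocking the DAG out of the DagBag.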

This leads to bizarre behavior, such as the DAG running fine the first time, then disappearing from the Airflow web interface afterwards.

So my suggestion to you is to add try/except to anything that might fit the bill, as a way of debugging your code.
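Rather than swallowing the exception outright, a variant that logs it can make the silent failure visible in the scheduler logs while you debug (a sketch; the function name and fallback value just mirror the example above):

```python
import json
import logging

def read_lakes_id_file_debug(path):
    # try/except as a debugging aid: log the full traceback so a
    # parse-time failure shows up in the logs, instead of silently
    # removing the DAG from the webserver's DagBag.
    try:
        with open(path) as json_file:
            return json.load(json_file)
    except Exception:
        logging.exception("read_lakes_id_file failed at parse time")
        return 'LOTS OF LAKES'  # fallback, as in the answer above
```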
