Airflow: dag_id could not be found


Question

I'm running an Airflow server and worker on different AWS machines. I've synced the dags folder between them, run airflow initdb on both, and checked that the dag_ids are the same when I run airflow list_tasks <dag_id>.

When I run the scheduler and the worker, I get this error on the worker:

airflow.exceptions.AirflowException: dag_id could not be found: . Either the dag did not exist or it failed to parse. [...] Command ...--local -sd /home/ubuntu/airflow/dags/airflow_tutorial.py'

The problem seems to be that the path in the command is wrong (/home/ubuntu/airflow/dags/airflow_tutorial.py), since the correct path is /home/hadoop/...

On the server machine the path contains ubuntu, but in both config files it is simply ~/airflow/...
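As an illustration of why an identical ~/airflow/... setting can still produce different absolute paths, the sketch below expands the same "~" path under two different home directories. It is only a sketch under stated assumptions: it relies on POSIX behaviour (where os.path.expanduser consults HOME), the helper name resolve_dags_folder is hypothetical, and /home/hadoop is inferred from the path mentioned in the question.

```python
import os


def resolve_dags_folder(configured: str, home: str) -> str:
    """Expand a '~' path as if the process ran with HOME set to `home`.

    Hypothetical helper for illustration; assumes a POSIX system.
    """
    previous = os.environ.get("HOME")
    os.environ["HOME"] = home
    try:
        return os.path.expanduser(configured)
    finally:
        # Restore the original HOME so the simulation has no side effects.
        if previous is None:
            del os.environ["HOME"]
        else:
            os.environ["HOME"] = previous


configured = "~/airflow/dags"
# Same config value, two different users (home directories are assumptions).
server_path = resolve_dags_folder(configured, "/home/ubuntu")
worker_path = resolve_dags_folder(configured, "/home/hadoop")
print(server_path)
print(worker_path)
```

If the absolute path is computed on one machine and then passed to the other, as the -sd argument in the error suggests, the second machine receives a path that only exists for the first machine's user.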

What makes the worker look in this path and not the correct one?

How do I tell it to look in the correct path?

Edit:

  • It's unlikely to be a config problem. I've run grep -R ubuntu and the only occurrences are in the logs.
  • When I run the same setup on a computer with ubuntu as the user, everything works. This leads me to believe that, for some reason, Airflow provides the worker with the full path of the task.

Recommended answer

Adding the --raw parameter to the airflow run command helped me see the original exception. In my case, the metadata database instance was too slow, and loading the DAGs failed because of a timeout. I fixed it by:

  • Upgrading the database instance
  • Increasing the dagbag_import_timeout parameter in airflow.cfg
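To judge whether a DAG file is close to the import timeout, one rough approach is to time how long the file takes to import on the worker. The sketch below is not Airflow's own loading code; it simply imports the file as a plain Python module, and the helper names time_dag_import and exceeds_timeout are hypothetical. It assumes dagbag_import_timeout is expressed in seconds.

```python
import importlib.util
import time


def time_dag_import(path: str) -> float:
    """Import the Python file at `path` and return the elapsed seconds.

    Hypothetical helper: approximates DAG parsing with a plain module import.
    """
    spec = importlib.util.spec_from_file_location("dag_under_test", path)
    module = importlib.util.module_from_spec(spec)
    start = time.perf_counter()
    # Runs the file's top-level code, including any database calls it makes.
    spec.loader.exec_module(module)
    return time.perf_counter() - start


def exceeds_timeout(elapsed: float, timeout_seconds: float) -> bool:
    """Return True if the measured import time is over the configured limit."""
    return elapsed > timeout_seconds
```

Calling time_dag_import on the DAG file from the worker machine, and comparing the result with the configured dagbag_import_timeout via exceeds_timeout, gives a quick indication of whether a slow database or heavy top-level code is the likely cause.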

Hope this helps!
