Why are my Airflow tasks queued but not running?
Question
I am new to Airflow and trying to set it up to run ETL pipelines. I was able to install:
- airflow
- postgres
- celery
- rabbitmq
I am able to test-run the tutorial dag. When I try to schedule the jobs, the scheduler picks them up and queues them (I can see this on the UI), but the tasks never run. Could somebody help me fix this issue?
Here is my configuration file:
[core]
airflow_home = /root/airflow
dags_folder = /root/airflow/dags
base_log_folder = /root/airflow/logs
executor = CeleryExecutor
sql_alchemy_conn = postgresql+psycopg2://xxxx.amazonaws.com:5432/airflow
api_client = airflow.api.client.local_client
[webserver]
web_server_host = 0.0.0.0
web_server_port = 8080
web_server_worker_timeout = 120
worker_refresh_batch_size = 1
worker_refresh_interval = 30
[celery]
celery_app_name = airflow.executors.celery_executor
celeryd_concurrency = 16
worker_log_server_port = 8793
broker_url = amqp://rabbit:rabbit@x.x.x.x/rabbitmq_vhost
celery_result_backend = db+postgresql+psycopg2://postgres:airflow@xxx.amazonaws.com:5432/airflow
flower_host = 0.0.0.0
flower_port = 5555
default_queue = default
DAG: this is the tutorial dag I used,
and the start date for my dag is -- 'start_date': datetime(2017, 4, 11),
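One thing worth checking with that start_date: for a scheduled DAG, Airflow only fires a run after the schedule interval following start_date has fully passed. A minimal sketch of that arithmetic (plain datetime, assuming a daily schedule_interval):

```python
from datetime import datetime, timedelta

# The DAG's start_date from the question.
start_date = datetime(2017, 4, 11)

# Assuming the tutorial's daily schedule: the first run covers the
# interval [start_date, start_date + 1 day) and only fires at its end.
schedule_interval = timedelta(days=1)
first_run_fires_at = start_date + schedule_interval

print(first_run_fires_at)  # 2017-04-12 00:00:00
```

So a run for 2017-04-11 is not triggered until 2017-04-12; if the start_date were in the future, nothing would be scheduled at all yet.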
Answer
You have to run all three components of Airflow, namely:
airflow webserver
airflow scheduler
airflow worker
If you only run the first two, the tasks will be queued but never executed. airflow worker provides the workers that actually execute the dags.
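The symptom maps onto a simple producer/consumer split: with the CeleryExecutor, the scheduler only *enqueues* task messages on the broker (RabbitMQ here), and nothing drains that queue until a worker process consumes it. A stand-in sketch of that behavior using a plain Python queue (not the real Celery broker):

```python
import queue

broker = queue.Queue()

# Scheduler side: tasks get enqueued on the broker.
for task in ["print_date", "sleep", "templated"]:
    broker.put(task)

# With no worker running, everything just sits in the "queued" state.
print(broker.qsize())  # 3

# Worker side: this drain loop is the role `airflow worker` plays.
while not broker.empty():
    task = broker.get()
    # ... execute the task instance ...
```

That is exactly what the UI shows: queued task instances accumulating until a worker starts consuming them.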
Also, note that celery 4.0.2 is currently not compatible with airflow 1.7 or 1.8. Use celery 3 instead.
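If you want to guard against that mismatch in a setup script, a small (hypothetical) helper can check the installed celery version string, on the assumption that only 3.x releases work with airflow 1.7/1.8:

```python
def is_compatible_celery(version: str) -> bool:
    """Return True if this celery version's major release is 3.x,
    the series that works with airflow 1.7/1.8."""
    major = int(version.split(".")[0])
    return major == 3

print(is_compatible_celery("3.1.25"))  # True
print(is_compatible_celery("4.0.2"))   # False
```

You could feed it `celery.__version__` at startup and fail fast with a clear message instead of hitting silent queueing problems later.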