Setting up S3 logging in Airflow

Problem Description

This is driving me crazy.

I'm setting up airflow in a cloud environment. I have one server running the scheduler and the webserver and one server as a celery worker, and I'm using airflow 1.8.0.

Running jobs works fine. What refuses to work is logging.

I've set up the correct path in airflow.cfg on both servers:


remote_base_log_folder = s3://my-bucket/airflow_logs/

remote_log_conn_id = s3_logging_conn

I've set up s3_logging_conn in the airflow UI, with the access key and the secret key as described here.
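
(For anyone reproducing this: the usual way to store those credentials is an S3-type connection whose Extra field holds them as JSON. A sketch with placeholder values, not the exact connection from the post:)

Conn Id:   s3_logging_conn
Conn Type: S3
Extra:     {"aws_access_key_id": "YOUR_ACCESS_KEY", "aws_secret_access_key": "YOUR_SECRET_KEY"}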

I tested the connection with:


s3 = airflow.hooks.S3Hook('s3_logging_conn')
s3.load_string('test', 'test', bucket_name='my-bucket')

This works on both servers. So the connection is properly set up. Yet all I get whenever I run a task is


*** Log file isn't local.
*** Fetching here: http://*******
*** Failed to fetch log file from worker.
*** Reading remote logs...
Could not read logs from s3://my-bucket/airflow_logs/my-dag/my-task/2018-02-15T21:46:47.577537

I tried manually uploading a log following the expected conventions and the webserver still can't pick it up, so the problem is on both ends. I'm at a loss as to what to do; everything I've read so far tells me this should be working. I'm close to just installing 1.9.0, which I hear changes logging, to see if I have more luck.
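
(One sanity check, not from the original post, that narrows this down: list what actually exists under the remote prefix using the same connection the logger uses. A minimal sketch, assuming the bucket, prefix and connection id shown above:)

from airflow.hooks.S3_hook import S3Hook

# Same connection id the remote logger is configured with.
s3 = S3Hook('s3_logging_conn')

# Airflow 1.8 writes task logs to <remote_base_log_folder>/<dag_id>/<task_id>/<execution_date>,
# while 1.9 appends /<try_number>.log, so compare what is really in the bucket
# against the path the webserver says it cannot read.
print(s3.list_keys(bucket_name='my-bucket', prefix='airflow_logs/my-dag/my-task/'))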

UPDATE: I made a clean install of Airflow 1.9 and followed the specific instructions here.

Webserver won't even start now with the following error:


airflow.exceptions.AirflowConfigException: section/key [core/remote_logging] not found in config

The check comes from this config template. So I tried removing it and just loading the S3 handler without checking first, and I got the following error message instead:


Unable to load the config, contains a configuration error.
Traceback (most recent call last):
  File "/usr/lib64/python3.6/logging/config.py", line 384, in resolve
    self.importer(used)
ModuleNotFoundError: No module named 'airflow.utils.log.logging_mixin.RedirectStdHandler'; 'airflow.utils.log.logging_mixin' is not a package

I get the feeling that this shouldn't be this hard.

Any help is much appreciated.

Recommended Answer

Solved:


  1. upgraded to 1.9
  2. ran the steps described in this comment
  3. added


[core]
remote_logging = True

to airflow.cfg
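
For completeness, the remote-logging block of airflow.cfg ends up looking roughly like this; the bucket and connection id are the ones from the question, and the last two keys come from the steps in the linked comment (a sketch, not the literal file):

[core]
remote_logging = True
remote_base_log_folder = s3://my-bucket/airflow_logs/
remote_log_conn_id = s3_logging_conn
logging_config_class = log_config.LOGGING_CONFIG
task_log_reader = s3.task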


pip install --upgrade airflow[log]
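
For anyone who can't follow the link, step 2 boils down to giving Airflow a custom logging config module that routes task logs through the S3 handler. The sketch below is an assumption based on the Airflow 1.9 default template, not the exact file from that comment; it would live at, say, $AIRFLOW_HOME/config/log_config.py with an empty __init__.py next to it and that config directory on PYTHONPATH, and S3_LOG_FOLDER is a placeholder matching remote_base_log_folder.

from copy import deepcopy

# Start from Airflow 1.9's default logging config and swap in the S3 task handler.
from airflow.config_templates.airflow_local_settings import DEFAULT_LOGGING_CONFIG

S3_LOG_FOLDER = 's3://my-bucket/airflow_logs/'

LOGGING_CONFIG = deepcopy(DEFAULT_LOGGING_CONFIG)

# Reuse the file-based task handler's settings so the formatter and the local
# fallback folder stay consistent with the defaults.
file_task = LOGGING_CONFIG['handlers']['file.task']
LOGGING_CONFIG['handlers']['s3.task'] = {
    'class': 'airflow.utils.log.s3_task_handler.S3TaskHandler',
    'formatter': file_task['formatter'],
    'base_log_folder': file_task['base_log_folder'],
    's3_log_folder': S3_LOG_FOLDER,
    'filename_template': file_task['filename_template'],
}

# Route task logs through the S3 handler instead of the local file handler.
LOGGING_CONFIG['loggers']['airflow.task']['handlers'] = ['s3.task']
LOGGING_CONFIG['loggers']['airflow.task_runner']['handlers'] = ['s3.task']

Once that module is importable, the logging_config_class and task_log_reader entries shown in the config block above pick it up.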


Everything's working now.
