Task fails due to not being able to read log file


Problem description

Composer is failing a task because it cannot read a log file; the error complains about incorrect encoding.

Here's the log that appears in the UI:

*** Unable to read remote log from gs://bucket/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** 'ascii' codec can't decode byte 0xc2 in position 6986: ordinal not in range(128)

*** Log file does not exist: /home/airflow/gcs/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** Fetching from: http://airflow-worker-68dc66c9db-x945n:8793/log/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='airflow-worker-68dc66c9db-x945n', port=8793): Max retries exceeded with url: /log/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1c9ff19d10>: Failed to establish a new connection: [Errno -2] Name or service not known',))
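The first error line is a Python `UnicodeDecodeError`: the log-serving code decoded the remote log with the default ASCII codec, but the file contains multi-byte UTF-8 sequences (0xc2 is the first byte of one). A minimal sketch of the same failure, using a made-up log line rather than the actual file contents:

```python
# Hypothetical log bytes containing a UTF-8 non-breaking space (\xc2\xa0),
# the kind of multi-byte sequence that begins with 0xc2.
raw = b"Task duration: 42\xc2\xa0seconds"

try:
    # Decoding as ASCII fails exactly like the Composer error above.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    print(exc)

# Decoding as UTF-8 recovers the text.
print(raw.decode("utf-8"))
```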

I tried viewing the file in the Google Cloud console and it also threw an error:

Failed to load

Tracking Number: 8075820889980640204

But I am able to download the file via gsutil.

When I view the file, it seems to have text overriding other text.

I can't show the entire file but it looks like this:

--------------------------------------------------------------------------------
Starting attempt 1 of 1
--------------------------------------------------------------------------------
@-@{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:23,313] {models.py:1569} INFO - Executing <Task(BigQueryOperator): merge_campaign_exceptions> on 2019-08-03T10:00:00+00:00@-@{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:23,314] {base_task_runner.py:124} INFO - Running: ['bash', '-c', u'airflow run __campaign_exceptions_0_0_1 merge_campaign_exceptions 2019-08-03T10:00:00+00:00 --job_id 22767 --pool _bq_pool --raw -sd DAGS_FOLDER//-campaign-exceptions.py --cfg_path /tmp/tmpyBIVgT']@-@{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:24,658] {base_task_runner.py:107} INFO - Job 22767: Subtask merge_campaign_exceptions [2019-08-04 10:01:24,658] {settings.py:176} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800@-@{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}

Where the @-@{} pieces seem to be "on top of" the typical log.
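The trailing @-@{...} fragments are structured metadata appended to each log line; the readable log text can be recovered by stripping them. A small sketch (the regex and helper name are my own, not from Airflow):

```python
import re

# Composer-style log lines end with "@-@" followed by a JSON metadata blob;
# this pattern matches that trailer so it can be removed.
METADATA_RE = re.compile(r"@-@\{.*\}\s*$")

def strip_metadata(line: str) -> str:
    """Return the log line with any trailing @-@{...} metadata removed."""
    return METADATA_RE.sub("", line)

line = ('[2019-08-04 10:01:23,313] {models.py:1569} INFO - Executing task'
        '@-@{"task-id": "merge_campaign_exceptions"}')
print(strip_metadata(line))
```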

Accepted answer

I faced the same problem. In my case the problem was that I had removed the google_cloud_default connection that was being used to retrieve the logs.

Check the configuration and look for the connection name.

[core]
remote_log_conn_id = google_cloud_default

Then check that the credentials used for that connection have the right permissions to access the GCS bucket.
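One way to verify both points from a shell (a sketch: the CLI flags below are the Airflow 1.10 style current when this was written and differ in Airflow 2.x, and the bucket path is the placeholder from the logs above):

```shell
# 1. List connections and confirm google_cloud_default still exists.
airflow connections --list

# 2. Verify the active service account can read the log bucket.
gsutil ls gs://bucket/logs/

# 3. If the connection was deleted, recreate it (Composer normally
#    provisions google_cloud_default automatically).
airflow connections --add \
    --conn_id google_cloud_default \
    --conn_type google_cloud_platform
```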

