为什么我会遇到“错误同步广告连播"与数据流管道? [英] Why did I encounter an "Error syncing pod" with Dataflow pipeline?

查看:169
本文介绍了为什么我会遇到“错误同步广告连播"与数据流管道?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

当我想使用PyPI中的特定库时,我在Dataflow管道中尝试了一个奇怪的错误.

我在ParDo中需要jsonschema,因此,在我的requirements.txt文件中,添加了jsonschema==3.2.0. 我使用以下命令行启动管道:

 python -m gcs_to_all \
    --runner DataflowRunner \
    --project <my-project-id> \
    --region europe-west1 \
    --temp_location gs://<my-bucket-name>/temp/ \
    --input_topic "projects/<my-project-id>/topics/<my-topic>" \
    --network=<my-network> \
    --subnetwork=<my-subnet> \
    --requirements_file=requirements.txt \
    --experiments=allow_non_updatable_job \
    --streaming  
 

在终端中,一切似乎都很好:

 INFO:root:2020-01-03T09:18:35.569Z: JOB_MESSAGE_BASIC: Worker configuration: n1-standard-4 in europe-west1-b.
INFO:root:2020-01-03T09:18:35.806Z: JOB_MESSAGE_WARNING: The network default doesn't have rules that open TCP ports 12345-12346 for internal connection with other VMs. Only rules with a target tag 'dataflow' or empty target tags set apply. If you don't specify such a rule, any pipeline with more than one worker that shuffles data will hang. Causes: Firewall rules associated with your network don't open TCP ports 12345-12346 for Dataflow instances. If a firewall rule opens connection in these ports, ensure target tags aren't specified, or that the rule includes the tag 'dataflow'.
INFO:root:2020-01-03T09:18:48.549Z: JOB_MESSAGE_DETAILED: Workers have started successfully.
 

在Dataflow网页上的日志"选项卡中没有错误,但是在堆栈驱动程序中没有错误:

message: "Error syncing pod 6515c378c6bed37a2c0eec1fcfea300c ("<dataflow-id>--01030117-c9pc-harness-5lkv_default(6515c378c6bed37a2c0eec1fcfea300c)"), skipping: [failed to "StartContainer" for "sdk0" with CrashLoopBackOff: "Back-off 10s restarting failed container=sdk0 pod=<dataflow-id>--01030117-c9pc-harness-5lkv_default(6515c378c6bed37a2c0eec1fcfea300c)""
message: ", failed to "StartContainer" for "sdk1" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=sdk1 pod=<dataflow-id>--01030117-c9pc-harness-5lkv_default(6515c378c6bed37a2c0eec1fcfea300c)"" 
message: ", failed to "StartContainer" for "sdk2" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=sdk2 pod=<dataflow-id>--01030117-c9pc-harness-5lkv_default(6515c378c6bed37a2c0eec1fcfea300c)"" 
message: ", failed to "StartContainer" for "sdk3" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=sdk3 pod=<dataflow-id>--01030117-c9pc-harness-5lkv_default(6515c378c6bed37a2c0eec1fcfea300c)"" 

我也发现了此错误(在信息模式下):

Collecting jsonschema (from -r /var/opt/google/staged/requirements.txt (line 1))
  Installing build dependencies: started
Looking in links: /var/opt/google/staged
  Installing build dependencies: started
Collecting jsonschema (from -r /var/opt/google/staged/requirements.txt (line 1))
  Installing build dependencies: started
Looking in links: /var/opt/google/staged
Collecting jsonschema (from -r /var/opt/google/staged/requirements.txt (line 1))
  Installing build dependencies: started
  Installing build dependencies: finished with status 'error'
  ERROR: Command errored out with exit status 1:
   command: /usr/local/bin/python3 /usr/local/lib/python3.7/site-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-mdurhav9/overlay --no-warn-script-location --no-binary :none: --only-binary :none: --no-index --find-links /var/opt/google/staged -- 'setuptools>=40.6.0' wheel
       cwd: None
  Complete output (5 lines):
  Looking in links: /var/opt/google/staged
  Collecting setuptools>=40.6.0
  Collecting wheel
    ERROR: Could not find a version that satisfies the requirement wheel (from versions: none)
  ERROR: No matching distribution found for wheel

但是我不知道为什么它可以得到这种依赖...

您知道我该如何调试吗?还是为什么我会遇到此错误?

谢谢

解决方案

Dataflow工作程序启动时,他们执行几个步骤:

  1. requirements.txt
  2. 安装软件包
  3. 安装指定为extra_packages
  4. 的软件包
  5. 安装工作流tarball并执行setup.py中提供的操作.

带有CrashLoopBackOff消息的

Error syncing pod可能与依赖项冲突有关.您需要验证与该作业使用的库和版本没有冲突.请参考文档以暂存所需的管道依赖项. /p>

还请查看预安装的依赖项和此 StackOverflow线程.

您可以尝试的是更改jsonschema的版本,然后尝试再次运行.如果没有帮助,请提供requirements.txt文件.

希望它能对您有所帮助.

I experiment a weird error with my Dataflow pipeline when I want to use specific library from PyPI.

I need jsonschema in a ParDo, so, in my requirements.txtfile, I added jsonschema==3.2.0. I launch my pipeline with the command line below:

python -m gcs_to_all \
    --runner DataflowRunner \
    --project <my-project-id> \
    --region europe-west1 \
    --temp_location gs://<my-bucket-name>/temp/ \
    --input_topic "projects/<my-project-id>/topics/<my-topic>" \
    --network=<my-network> \
    --subnetwork=<my-subnet> \
    --requirements_file=requirements.txt \
    --experiments=allow_non_updatable_job \
    --streaming  

In the terminal, all seems to be good:

INFO:root:2020-01-03T09:18:35.569Z: JOB_MESSAGE_BASIC: Worker configuration: n1-standard-4 in europe-west1-b.
INFO:root:2020-01-03T09:18:35.806Z: JOB_MESSAGE_WARNING: The network default doesn't have rules that open TCP ports 12345-12346 for internal connection with other VMs. Only rules with a target tag 'dataflow' or empty target tags set apply. If you don't specify such a rule, any pipeline with more than one worker that shuffles data will hang. Causes: Firewall rules associated with your network don't open TCP ports 12345-12346 for Dataflow instances. If a firewall rule opens connection in these ports, ensure target tags aren't specified, or that the rule includes the tag 'dataflow'.
INFO:root:2020-01-03T09:18:48.549Z: JOB_MESSAGE_DETAILED: Workers have started successfully.

Where's no error in the log tab on Dataflow webpage, but in stackdriver:

message: "Error syncing pod 6515c378c6bed37a2c0eec1fcfea300c ("<dataflow-id>--01030117-c9pc-harness-5lkv_default(6515c378c6bed37a2c0eec1fcfea300c)"), skipping: [failed to "StartContainer" for "sdk0" with CrashLoopBackOff: "Back-off 10s restarting failed container=sdk0 pod=<dataflow-id>--01030117-c9pc-harness-5lkv_default(6515c378c6bed37a2c0eec1fcfea300c)""
message: ", failed to "StartContainer" for "sdk1" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=sdk1 pod=<dataflow-id>--01030117-c9pc-harness-5lkv_default(6515c378c6bed37a2c0eec1fcfea300c)"" 
message: ", failed to "StartContainer" for "sdk2" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=sdk2 pod=<dataflow-id>--01030117-c9pc-harness-5lkv_default(6515c378c6bed37a2c0eec1fcfea300c)"" 
message: ", failed to "StartContainer" for "sdk3" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=sdk3 pod=<dataflow-id>--01030117-c9pc-harness-5lkv_default(6515c378c6bed37a2c0eec1fcfea300c)"" 

I find this error too (in info mode):

Collecting jsonschema (from -r /var/opt/google/staged/requirements.txt (line 1))
  Installing build dependencies: started
Looking in links: /var/opt/google/staged
  Installing build dependencies: started
Collecting jsonschema (from -r /var/opt/google/staged/requirements.txt (line 1))
  Installing build dependencies: started
Looking in links: /var/opt/google/staged
Collecting jsonschema (from -r /var/opt/google/staged/requirements.txt (line 1))
  Installing build dependencies: started
  Installing build dependencies: finished with status 'error'
  ERROR: Command errored out with exit status 1:
   command: /usr/local/bin/python3 /usr/local/lib/python3.7/site-packages/pip install --ignore-installed --no-user --prefix /tmp/pip-build-env-mdurhav9/overlay --no-warn-script-location --no-binary :none: --only-binary :none: --no-index --find-links /var/opt/google/staged -- 'setuptools>=40.6.0' wheel
       cwd: None
  Complete output (5 lines):
  Looking in links: /var/opt/google/staged
  Collecting setuptools>=40.6.0
  Collecting wheel
    ERROR: Could not find a version that satisfies the requirement wheel (from versions: none)
  ERROR: No matching distribution found for wheel

But I don't know why it can get this dependency...

Do you have any idea how I can debug this? or why I encounter this error?

Thanks

解决方案

When Dataflow workers start, they execute several steps:

  1. Install packages from requirements.txt
  2. Install packages specified as extra_packages
  3. Install workflow tarball and execute actions provided in setup.py.

Error syncing pod with CrashLoopBackOff message can be related to dependency conflict. You need to verify that there are no conflicts with the libraries and versions used for the job. Please refer to the documentation for staging required dependencies of the pipeline.

Also, take a look for preinstalled dependencies and this StackOverflow thread.

What you can try is change the version of jsonschema and try run it again. If it wouldn't help, please provide requirements.txt file.

I hope it will help you.

这篇关于为什么我会遇到“错误同步广告连播"与数据流管道?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆