Why do Dataflow steps not start?

Question

I have a linear three-step Dataflow pipeline. For some reason the last step started, but the two preceding steps stayed in Not started for a long time before I gave up and killed the job. I'm not sure what caused this, since the same pipeline has run successfully in the past, and I'm surprised the logs showed no errors explaining what was preventing the first two steps from starting. What can cause this situation, and how can I prevent it?

Recommended answer

This was happening because of an error during worker startup. Certain Dataflow steps do not seem to require workers (e.g. writing to GCS), which is why that step was able to start. In other words, a step starting does not imply that workers are being created correctly. Worker startup logs are not displayed in the job logs by default: you need to click the Stackdriver link in the job logs and then add worker-startup in the logs drop-down in order to see any of those errors.
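As a rough illustration of the same lookup done with a query instead of the drop-down, the sketch below builds a log filter string that selects only the worker-startup log for one job. It assumes the log is named dataflow.googleapis.com/worker-startup and that Dataflow entries use the dataflow_step resource type with a job_id label; the function name, project ID, and job ID are hypothetical placeholders, so adjust them to your environment.

```python
from urllib.parse import quote


def worker_startup_filter(project_id: str, job_id: str) -> str:
    """Build a log filter for the worker-startup log of one Dataflow job.

    Assumptions (not confirmed by the original answer): the log ID is
    "dataflow.googleapis.com/worker-startup" and job entries carry the
    "dataflow_step" resource type with a "job_id" label.
    """
    # The slash inside a log ID must be URL-encoded as %2F in logName.
    log_id = quote("dataflow.googleapis.com/worker-startup", safe="")
    return (
        'resource.type="dataflow_step" '
        f'AND resource.labels.job_id="{job_id}" '
        f'AND logName="projects/{project_id}/logs/{log_id}"'
    )


if __name__ == "__main__":
    # Hypothetical project and job IDs, for illustration only.
    print(worker_startup_filter("my-project", "my-job-id"))
```

The printed string can be pasted into the logs viewer's query box, or passed as the filter argument to gcloud logging read, to check for startup errors that never appear in the default job log view.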
