Dataflow streaming job not scaling past 1 worker

Problem description

My streaming Dataflow job (2017-09-08_03_55_43-9675407418829265662), built with Apache Beam SDK for Java 2.1.0, will not scale past 1 worker even though the Pub/Sub backlog keeps growing (now 100k undelivered messages). Do you have any ideas why?

It is currently running with autoscalingAlgorithm=THROUGHPUT_BASED and maxNumWorkers=10.

Recommended answer

Dataflow Engineer here. I looked up the job in our backend and I can see that it is not scaling up because CPU utilization is low, meaning something else is limiting the performance of the pipeline, such as external throttling. Upscaling rarely helps in these cases.
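
To make the "low CPU utilization" symptom concrete, here is a minimal sketch in plain Java. It is not Beam or Dataflow code, and it does not model how the Dataflow autoscaler makes decisions. The class `CpuVsWallClock` and the method `callThrottledService` are hypothetical names; the latter simply sleeps to stand in for a throttled external call. The sketch compares the CPU time a thread consumes with the wall-clock time that passes while it waits.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical stand-alone demo; not part of the Beam or Dataflow APIs.
class CpuVsWallClock {

    // Stand-in for a throttled external service (assumption): the thread only waits.
    static void callThrottledService(long waitMillis) {
        try {
            Thread.sleep(waitMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns CPU time divided by wall-clock time for the current thread
    // while it runs the task. Requires a JVM that supports thread CPU timing.
    static double cpuFraction(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long cpuStart = threads.getCurrentThreadCpuTime();
        long wallStart = System.nanoTime();
        task.run();
        long wallNanos = System.nanoTime() - wallStart;
        long cpuNanos = threads.getCurrentThreadCpuTime() - cpuStart;
        return wallNanos == 0 ? 0.0 : (double) cpuNanos / wallNanos;
    }

    public static void main(String[] args) {
        double fraction = cpuFraction(() -> callThrottledService(200));
        System.out.printf("CPU fraction while waiting: %.3f%n", fraction);
    }
}
```

A thread that spends its time blocked like this reports a CPU fraction close to zero, which is the kind of pattern where adding workers is unlikely to increase throughput.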

I see that some bundles are taking up to hours to process. I recommend investigating your pipeline logic to see whether other parts can be optimized.
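
One possible way to start that investigation is to time each element as it passes through a processing step and record the ones that exceed a threshold. The sketch below is plain Java rather than a Beam `DoFn`; `SlowElementFinder`, `findSlowElements`, and `process` are hypothetical names, and the element `"slow"` merely simulates a blocking call. In a real pipeline the same timing idea would have to be adapted to the actual transform.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of the Beam SDK.
class SlowElementFinder {

    // Runs `step` on every element and returns the elements whose
    // processing took at least `thresholdMillis` of wall-clock time.
    static <T> List<T> findSlowElements(List<T> elements, Consumer<T> step, long thresholdMillis) {
        List<T> slow = new ArrayList<>();
        for (T element : elements) {
            long start = System.nanoTime();
            step.accept(element);
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMillis >= thresholdMillis) {
                slow.add(element);
            }
        }
        return slow;
    }

    // Stand-in processing step (assumption): "slow" simulates a blocking call.
    static void process(String element) {
        if (element.equals("slow")) {
            try {
                Thread.sleep(150);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "slow", "b");
        System.out.println(findSlowElements(input, SlowElementFinder::process, 100));
    }
}
```

Measuring wall-clock time rather than CPU time is deliberate here: elements that wait on an external system are exactly the ones that would show up as slow while consuming little CPU.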
