At what stage does Dataflow/Apache Beam ack a Pub/Sub message?

Question

I have a Dataflow streaming job with a Pub/Sub subscription as an unbounded source. I want to know at what stage Dataflow acks an incoming Pub/Sub message. It appears to me that the message is lost if an exception is thrown during any stage of the Dataflow pipeline.

I'd also like to know the best practices for writing a Dataflow pipeline with a Pub/Sub unbounded source so that messages can be recovered on failure. Thank you!

Recommended answer

The Dataflow streaming runner acks the Pub/Sub messages received by a bundle only after the bundle has succeeded and its results (outputs, state mutations, etc.) have been durably committed. Failed bundles are retried until they succeed and do not cause data loss. If you believe that data loss is happening, please include details (the job ID and the reasoning that led you to conclude that data was dropped because of the failures) and we'll investigate.
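
The ordering described above (process the bundle, commit its results, then ack) can be illustrated with a small toy model in plain Python. This is only a sketch of the semantics, not Dataflow's or Beam's actual implementation: `ToySubscription`, `run`, and `make_flaky_upper` are hypothetical names invented for this example, and the in-memory `committed` list merely stands in for durable storage.

```python
from collections import deque


class ToySubscription:
    """Hypothetical stand-in for a Pub/Sub subscription.

    Messages stay pending (and are redelivered) until they are acked.
    """

    def __init__(self, messages):
        self._pending = deque(messages)
        self.acked = []

    def pull(self, max_messages):
        # Deliver messages without removing them; unacked ones come back later.
        return list(self._pending)[:max_messages]

    def ack(self, messages):
        for message in messages:
            self._pending.remove(message)
            self.acked.append(message)

    def has_pending(self):
        return bool(self._pending)


def run(subscription, process, bundle_size=2):
    """Process bundles, acking each one only after its results are committed."""
    committed = []  # assumption: stands in for durably committed bundle results
    attempts = 0
    while subscription.has_pending():
        bundle = subscription.pull(bundle_size)
        attempts += 1
        try:
            outputs = [process(message) for message in bundle]
        except Exception:
            # The bundle failed: nothing is committed and nothing is acked,
            # so the same messages are delivered again on the next attempt.
            continue
        committed.extend(outputs)  # 1. commit the bundle's results
        subscription.ack(bundle)   # 2. only then ack its messages
    return committed, attempts


def make_flaky_upper(fail_times):
    """Return a processing function that fails `fail_times` times on message 'b'."""
    remaining = {"n": fail_times}

    def process(message):
        if message == "b" and remaining["n"] > 0:
            remaining["n"] -= 1
            raise RuntimeError("transient failure")
        return message.upper()

    return process


sub = ToySubscription(["a", "b", "c"])
committed, attempts = run(sub, make_flaky_upper(fail_times=1))
print(committed, attempts, sub.acked)
```

In this toy run the first bundle fails once, is retried as a whole, and every message still ends up committed and acked exactly once. Note that in this model a bundle that fails on every attempt would be retried forever, so the sketch only covers transient failures.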
