At what stage does Dataflow/Apache Beam ack a Pub/Sub message?

Question

I have a Dataflow streaming job with a Pub/Sub subscription as an unbounded source. I want to know at what stage Dataflow acks the incoming Pub/Sub message. It appears to me that the message is lost if an exception is thrown during any stage of the Dataflow pipeline.

Also, I'd like to know the best practices for writing a Dataflow pipeline with a Pub/Sub unbounded source so that messages can be recovered on failure. Thank you!

Recommended answer

The Dataflow streaming runner acks the Pub/Sub messages received by a bundle only after the bundle has succeeded and its results (outputs, state mutations, etc.) have been durably committed. Failed bundles are retried until they succeed, so they do not cause data loss. If you believe that data loss is happening, please include details (the job ID and the reasoning that led you to conclude that data was dropped because of the failures) and we'll investigate.
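To make the ordering described above concrete, here is a minimal toy model in plain Python. It is only a sketch of the "commit first, ack second, retry on failure" idea and not Dataflow's actual implementation. `ToySubscription`, `run_bundles`, and `make_flaky` are hypothetical names invented for this illustration; none of them are part of Apache Beam or the Pub/Sub client library.

```python
from collections import deque
from typing import Callable, Deque, List


class ToySubscription:
    """Hypothetical stand-in for a Pub/Sub subscription (not the real client)."""

    def __init__(self, messages: List[str]) -> None:
        self._pending: Deque[str] = deque(messages)
        self.acked: List[str] = []

    def pull(self, max_messages: int) -> List[str]:
        # Pulled messages stay pending until they are explicitly acked.
        return list(self._pending)[:max_messages]

    def ack(self, messages: List[str]) -> None:
        for message in messages:
            self._pending.remove(message)
            self.acked.append(message)

    def has_pending(self) -> bool:
        return bool(self._pending)


def run_bundles(
    subscription: ToySubscription,
    process: Callable[[str], str],
    bundle_size: int = 2,
    max_attempts: int = 10,
) -> List[str]:
    """Process messages in bundles, acking a bundle only after its commit."""
    committed: List[str] = []
    attempts = 0
    while subscription.has_pending() and attempts < max_attempts:
        attempts += 1
        bundle = subscription.pull(bundle_size)
        try:
            outputs = [process(message) for message in bundle]
        except Exception:
            # The bundle failed: nothing is committed and nothing is acked,
            # so the same messages are pulled again on the next attempt.
            continue
        committed.extend(outputs)   # stands in for the durable commit
        subscription.ack(bundle)    # the ack happens only after the commit
    return committed


def make_flaky(fail_once_on: str) -> Callable[[str], str]:
    """Build a processing function that throws once for one given message."""
    remaining = {fail_once_on: 1}

    def process(message: str) -> str:
        if remaining.get(message, 0) > 0:
            remaining[message] -= 1
            raise RuntimeError(f"transient failure on {message!r}")
        return message.upper()

    return process


sub = ToySubscription(["a", "b", "c"])
result = run_bundles(sub, make_flaky("b"))
print(result)     # committed outputs
print(sub.acked)  # messages acked after their bundle committed
```

In this model the exception thrown for `"b"` fails the whole first bundle, so `"a"` is processed twice but its output is committed only once, and no message is acked before its bundle commits. Note that because a failed bundle is retried until it succeeds, an element that always throws would be retried indefinitely; a commonly used approach is to catch such errors inside the transform and emit the offending element to a separate dead-letter output instead of letting the exception escape.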
