Is there a Dataflow source ack API?


Question

As per "Shutdown and update job in Google Dataflow with PubSubIO + message guarantees", the Pub/Sub source for Dataflow does not ack messages until they have been reliably persisted. Is there any possibility of manual control over this? We're persisting rows as a side effect in a ParDo, since there is currently no unbounded custom sink support. Is there any way for us to mark that ParDo as "on bundle processing success, ack these records"?

Alternatively, could we persist as a side effect in a ParDo, throw an exception if it fails, and then after that ParDo have some sort of "dummy" streaming sink in the pipeline, like BigQuery, to make sure the messages are ack'd? Would throwing exceptions as part of normal, expected behaviour lead to new problems?

Is the answer here really "just wait for unbounded custom sink support"?

Answer

I believe Dataflow automatically gives you the behavior you want: we will not ack Pub/Sub messages until we have finished processing them with your ParDos and persisted the results.
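To make the bundle-level contract concrete, here is a toy sketch of the semantics described above, not the actual Dataflow SDK API (`AckTracker`, `process_bundle`, and `persist_rows` are hypothetical names): messages are ack'd only after the ParDo's processing and its side-effect persist both succeed, so an exception thrown mid-bundle simply leaves every message in that bundle un-ack'd and eligible for redelivery.

```python
# Toy model of Dataflow's ack-after-persist behavior for a Pub/Sub source.
# All names here are illustrative; this is not the Dataflow SDK.

class AckTracker:
    """Records which message IDs have been ack'd back to the source."""
    def __init__(self):
        self.acked = set()

    def ack(self, message_ids):
        self.acked.update(message_ids)

def process_bundle(messages, persist_rows, tracker):
    """Process one bundle; ack only if the DoFn work and the persist succeed.

    If anything throws, no message in the bundle is ack'd, so the source
    redelivers them and the bundle is retried.
    """
    try:
        rows = [m["data"].upper() for m in messages]  # stand-in for the ParDo
        persist_rows(rows)                            # side-effect write
    except Exception:
        return False  # nothing ack'd -> messages will be redelivered
    tracker.ack(m["id"] for m in messages)
    return True

# A successful bundle: rows persisted, then both messages ack'd.
store = []
tracker = AckTracker()
ok = process_bundle(
    [{"id": 1, "data": "a"}, {"id": 2, "data": "b"}],
    persist_rows=store.extend,
    tracker=tracker,
)

# A failing persist: the bundle reports failure and message 3 stays un-ack'd.
def flaky_persist(rows):
    raise IOError("write failed")

ok2 = process_bundle([{"id": 3, "data": "c"}], flaky_persist, tracker)
```

This is also why throwing an exception from the ParDo, as the question suggests, is safe with respect to message loss: the un-ack'd messages simply come back.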
