Using Google Cloud Dataflow PubSubIO, when does the read of a message get acknowledged?
Question
Is it possible to delay the acknowledgement until the subgraph (everything downstream of PubSubIO.Read) has been processed successfully?
For example, we are streaming reads from a Google Pub/Sub subscription, then writing a file to GCS in one branch and writing to BigQuery in another branch using BigQueryIO.Write...
We do see that if an exception occurs, processing is retried indefinitely, since we are in streaming mode. However, if we cancel the job and redeploy with a code change, the message is not reprocessed.
Recommended answer
The acknowledgement is made once the message has been durably persisted somewhere in the Dataflow pipeline, not once the whole subgraph has finished. That is why a message that was read but not yet fully processed is not redelivered after you cancel the job: it has already been acknowledged. If you want to make changes to a pipeline without losing in-flight data, use the Update feature instead of Cancel: https://cloud.google.com/dataflow/pipelines/updating-a-pipeline
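To make the difference between Cancel and Update concrete, here is a minimal toy model in plain Python. It is only a sketch of the behaviour described in the answer: `ToySubscription`, `ToyPipeline`, and `failing` are hypothetical names invented for illustration and are not Dataflow or Pub/Sub APIs. The assumption is simply that a message is acknowledged as soon as it is stored in the pipeline's own state, that Cancel throws this state away, and that Update carries it over to the replacement job.

```python
# Toy model only: none of these classes are Dataflow or Pub/Sub APIs.
class ToySubscription:
    """Stands in for a subscription: only unacked messages can be redelivered."""

    def __init__(self, messages):
        self.unacked = list(messages)

    def pull(self):
        return list(self.unacked)

    def ack(self, message):
        self.unacked.remove(message)


class ToyPipeline:
    """Stands in for a streaming job that keeps durable internal state."""

    def __init__(self, subscription, state=None):
        self.subscription = subscription
        # Messages persisted inside the pipeline but not yet written to a sink.
        self.state = list(state) if state is not None else []
        self.outputs = []

    def read(self):
        for message in self.subscription.pull():
            self.state.append(message)      # persisted inside the pipeline
            self.subscription.ack(message)  # acked here, before any sink runs

    def process(self, transform):
        # A message leaves the state only after its output has been produced.
        while self.state:
            self.outputs.append(transform(self.state[0]))
            self.state.pop(0)


def failing(message):
    """Hypothetical buggy user code that always throws."""
    raise ValueError("bug in user code")


def run_until_failure(job):
    job.read()
    try:
        job.process(failing)
    except ValueError:
        pass  # a real streaming job would keep retrying here


# Scenario 1: cancel the job, then start a fresh job with fixed code.
sub = ToySubscription(["m1", "m2"])
job = ToyPipeline(sub)
run_until_failure(job)
fresh_job = ToyPipeline(sub)        # cancel: the old job's state is discarded
fresh_job.read()                    # nothing is redelivered, it was acked
fresh_job.process(str.upper)
cancel_outputs = fresh_job.outputs

# Scenario 2: update the job, carrying the persisted state over.
sub = ToySubscription(["m1", "m2"])
job = ToyPipeline(sub)
run_until_failure(job)
updated_job = ToyPipeline(sub, state=job.state)  # update: state is kept
updated_job.read()
updated_job.process(str.upper)
update_outputs = updated_job.outputs
```

In this model the cancelled-and-redeployed job produces no output for `m1` and `m2`, while the updated job processes both with the fixed code. With the Dataflow SDK for Java, an update is requested by launching the replacement pipeline with the `--update` option and the same `--jobName` as the running job; the page linked above describes the details and compatibility checks.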