Deduplicate pubsub messages back to pubsub with dataflow possible?

Problem description

I have an application writing data to Google Cloud Pub/Sub, and per the Pub/Sub documentation, duplicates caused by the retry mechanism can happen once in a while. There is also the issue of message ordering, which Pub/Sub likewise does not guarantee.

Also per the documentation, it is possible to use Google Cloud Dataflow to deduplicate these messages.

I want to make those messages available in a messaging queue (meaning Cloud Pub/Sub) for services to consume, and Cloud Dataflow does seem to have a PubsubIO writer. However, wouldn't you then be back to exactly the same problem, where writing to Pub/Sub can create duplicates? And wouldn't the same apply to ordering? How can I stream messages in order using Pub/Sub (or any other system, for that matter)?

Is it possible to use Cloud Dataflow to read from one Pub/Sub topic and write to another Pub/Sub topic with a guarantee of no duplicates? If not, how else would you do this in a way that supports streaming for a relatively small amount of data?

Also, I am very new to Apache Beam / Cloud Dataflow. What would such a simple use case look like? I suppose I can deduplicate using the ID generated by Pub/Sub itself, since I let the Pub/Sub library do its internal retries rather than retrying myself, so the ID should be the same across retries.
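
For illustration, here is a minimal, untested sketch in Java of what such a topic-to-topic pipeline might look like. The project, subscription, topic names, and the messageUniqueId attribute are placeholders: the assumption is that the publisher stamps each message with its own unique ID as an attribute, and PubsubIO's withIdAttribute then asks the runner to deduplicate on that value within its deduplication window.

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.options.StreamingOptions;

public class PubsubToPubsub {
  public static void main(String[] args) {
    StreamingOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(StreamingOptions.class);
    options.setStreaming(true);

    Pipeline p = Pipeline.create(options);

    p.apply("ReadFromSource",
            PubsubIO.readStrings()
                .fromSubscription("projects/my-project/subscriptions/source-sub")
                // Hypothetical attribute set by the publisher; used here to deduplicate on read.
                .withIdAttribute("messageUniqueId"))
     .apply("WriteToSink",
            PubsubIO.writeStrings()
                .to("projects/my-project/topics/deduped-topic")
                // Propagate the same attribute so downstream readers can deduplicate as well.
                .withIdAttribute("messageUniqueId"));

    p.run();
  }
}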

Recommended answer

Cloud Dataflow / Apache Beam are Mack trucks. They are designed for parallelizing large data sources and streams. You can send huge amounts of data to Pub/Sub, but detecting duplicates is not a job for Beam, because that task needs to be serialized.

Reading from Pub/Sub and then writing to a different topic does not remove the issue of duplicates, because duplicates can occur on the new topic you are writing to. In addition, parallelizing the queue writes makes the out-of-order message problem worse.

The problem with duplicates needs to be solved on the client side that reads from the subscription. A simple database query can tell you whether an item has already been processed; if it has, you just discard the message.
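
As a concrete illustration of that idea, here is a small sketch in Java using JDBC. The processed_messages table, its unique message_id column, and the PostgreSQL ON CONFLICT syntax are assumptions made for the example; the point is that the insert acts as an atomic "have I already seen this ID?" check.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class DedupChecker {
  private final Connection conn;

  public DedupChecker(String jdbcUrl) throws SQLException {
    this.conn = DriverManager.getConnection(jdbcUrl);
  }

  // Returns true if this messageId has not been seen before and records it;
  // returns false if it was already processed (i.e. the message is a duplicate).
  public boolean markIfNew(String messageId) throws SQLException {
    // Relies on a UNIQUE constraint on processed_messages.message_id;
    // ON CONFLICT DO NOTHING makes the check-and-insert atomic (PostgreSQL syntax).
    String sql =
        "INSERT INTO processed_messages (message_id) VALUES (?) ON CONFLICT (message_id) DO NOTHING";
    try (PreparedStatement stmt = conn.prepareStatement(sql)) {
      stmt.setString(1, messageId);
      return stmt.executeUpdate() == 1; // one row inserted => first time this ID is seen
    }
  }
}

In the subscriber callback you would call markIfNew with the message's ID; if it returns false, the message is a duplicate and can simply be acknowledged and dropped.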

Handling out-of-order messages must also be designed into your application.
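
One common way to do that, sketched below purely as an illustration, is to have the publisher attach a monotonically increasing sequence number to each message and to resequence on the consumer side: buffer anything that arrives early and release messages only once the next expected sequence number has been seen.

import java.util.TreeMap;
import java.util.function.Consumer;

// Buffers messages by a publisher-assigned sequence number and releases them in order.
public class Resequencer<T> {
  private final TreeMap<Long, T> pending = new TreeMap<>();
  private long nextExpected;
  private final Consumer<T> downstream;

  public Resequencer(long firstSequence, Consumer<T> downstream) {
    this.nextExpected = firstSequence;
    this.downstream = downstream;
  }

  // Accepts a message with its sequence number; delivers any now-contiguous run in order.
  public synchronized void accept(long sequence, T message) {
    if (sequence < nextExpected) {
      return; // already delivered (duplicate or late arrival) -> drop
    }
    pending.put(sequence, message);
    // Release the longest contiguous prefix starting at nextExpected.
    while (pending.containsKey(nextExpected)) {
      downstream.accept(pending.remove(nextExpected));
      nextExpected++;
    }
  }
}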

Pub/Sub is designed to be a lightweight, inexpensive message queue system. If you need guaranteed message ordering, no duplicates, FIFO semantics, and so on, you will need to use a different solution, which of course is much more expensive.
