Apache Spark's Structured Streaming with Google Pub/Sub

Question

I'm using Spark DStreams to pull and process data from Google Pub/Sub.

I'm looking for a way to move to Structured Streaming while still using Pub/Sub.

I should also mention that my messages in Pub/Sub are Snappy-compressed.

I found this issue, which claims that using Pub/Sub with Structured Streaming is not supported.

Has anyone encountered this problem? Is it possible to implement a custom Receiver to read the data from Pub/Sub?

Thanks

Recommended answer

The feature request you referenced is still accurate: Cloud Pub/Sub has no concept of an offset for tracking your read position, so Structured Streaming with Cloud Pub/Sub is not supported.
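
To make the offset argument concrete, here is a toy model in plain Python. It is only a sketch under stated assumptions: `OffsetLog` and `AckQueue` are hypothetical classes invented for illustration, not Spark or Pub/Sub APIs, and the ack queue is a deliberate simplification of Pub/Sub's pull-and-acknowledge model. The point it tries to show is that an offset-addressed source can re-read exactly the same range after a failure, which is what a replayable micro-batch source relies on, whereas an ack-based queue has no position to ask for again.

```python
from dataclasses import dataclass, field


@dataclass
class OffsetLog:
    """Toy offset-addressed log: records stay in place and are read by position."""
    records: list = field(default_factory=list)

    def append(self, value):
        self.records.append(value)

    def latest_offset(self):
        # The next position to be written; a reader can checkpoint this number.
        return len(self.records)

    def read_range(self, start, end):
        # The same (start, end) always yields the same records,
        # so a failed batch can be replayed exactly.
        return self.records[start:end]


@dataclass
class AckQueue:
    """Toy ack-based subscription (hypothetical simplification, not the Pub/Sub API)."""
    pending: list = field(default_factory=list)

    def publish(self, value):
        self.pending.append(value)

    def pull_and_ack(self, max_messages):
        # Acknowledged messages are removed; there is no offset to re-read them by.
        batch = self.pending[:max_messages]
        del self.pending[:max_messages]
        return batch


log = OffsetLog()
queue = AckQueue()
for msg in ["a", "b", "c"]:
    log.append(msg)
    queue.publish(msg)

first_read = log.read_range(0, 2)    # initial attempt at a batch
replayed = log.read_range(0, 2)      # retry of the same batch after a failure
first_pull = queue.pull_and_ack(2)   # initial pull from the ack-based queue
second_pull = queue.pull_and_ack(2)  # a "retry" sees different data
```

In this model the two reads from `OffsetLog` return identical data, while the two pulls from `AckQueue` do not, which mirrors the reason given in the answer.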
