Kafka HDFS Connector - Without Full Confluent

Question

I have a running instance of Kafka 0.10, and I'm currently using Gobblin to store data in HDFS. I want to switch to Kafka Connect, and while researching I found that Confluent provides a connector.

However, is there a way to use this connector without using the entire Confluent Platform? For example, can I copy the relevant scripts from the Confluent source and somehow make my Kafka instance use them? I'm still learning my way through this, so I'm not yet well versed in this space.

Thanks.

Recommended answer

Yes, it is possible; I've done it. I use a slightly modified Confluent HDFS standalone connector that runs in a Docker container. However, you will have to use Schema Registry too, because the connector is tightly coupled to it. You will also have to send messages in a special format: to support automatic schema recognition, Confluent's Kafka consumers use an internal message format. Therefore, to be compatible with Confluent consumers, your producers must compose messages according to the following format.

  • Header (5 bytes)
    • The first byte, the "magic byte", should always be 0.
    • The next 4 bytes should be the schema ID from the Schema Registry, encoded in big-endian format.
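
The header layout above can be sketched in a few lines of Python. This is only an illustration, not the connector's own code: the function names `frame_message` and `parse_header` are hypothetical, and the sketch assumes that the bytes following the 5-byte header are the already-serialized (typically Avro-encoded) message body and that the schema ID has already been obtained from the registry.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0      # first header byte, always 0
HEADER_SIZE = 5     # 1 magic byte + 4-byte schema ID


def frame_message(schema_id: int, payload: bytes) -> bytes:
    """Prepend the 5-byte header to an already-serialized payload.

    ">BI" means big-endian, one unsigned byte, one unsigned 32-bit int.
    `payload` is assumed to be encoded elsewhere (hypothetical input).
    """
    return struct.pack(">BI", MAGIC_BYTE, schema_id) + payload


def parse_header(message: bytes) -> Tuple[int, bytes]:
    """Split a framed message into (schema_id, payload), validating the header."""
    if len(message) < HEADER_SIZE:
        raise ValueError("message is shorter than the 5-byte header")
    magic, schema_id = struct.unpack(">BI", message[:HEADER_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError("unexpected magic byte: %d" % magic)
    return schema_id, message[HEADER_SIZE:]


if __name__ == "__main__":
    framed = frame_message(42, b"example-body")
    print(framed[:HEADER_SIZE].hex())
    print(parse_header(framed))
```

Running `parse_header` on a message before producing it is a cheap way to confirm that the header is well formed; it does not check that the schema ID actually exists in the registry or that the body matches that schema.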

    PS: Be very careful when sending messages to the topic, because if a message does not match the schema, or a schema with that ID does not exist in the registry, the consumer fails silently: the worker thread stops, but the application still hangs in memory and does not exit.
