Kafka Connector for Azure Blob Storage

Question

I need to store the messages pushed to Kafka in deep storage. We are using Azure cloud services, so I suppose Azure Blob Storage could be a better option. I want to use Kafka Connect's sink connector API to push data to Azure Blob Storage. The Kafka documentation mostly suggests HDFS for exporting data; however, in that case I would need a Linux VM running Hadoop, which I guess would be costly. My question: is Azure Blob Storage an appropriate choice for storing JSON objects, and is building a custom sink connector a reasonable solution for this case?

Recommended Answer

A custom sink connector definitely works. Kafka Connect was designed precisely so that you can plug in connectors. In fact, connector development is entirely federated. Confluent's JDBC and HDFS connectors were implemented first simply because of the popularity of those two use cases, but there are many more (we keep a list of the connectors we're aware of).

As for whether Azure Blob Storage is appropriate, you mention JSON objects. I think the only thing you'll want to consider is the size of the objects and whether Azure storage will handle the size and number of objects well. I am not sure about Azure storage's characteristics, but in many other object storage systems you might need to aggregate many objects into a single blob to get good performance for a large number of objects (i.e., you might need a file format that supports many JSON objects).
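To make the aggregation idea concrete, here is a minimal sketch of batching many JSON records into one newline-delimited JSON (JSON Lines) payload before writing it as a single blob. It is written in Python for brevity and is independent of both the Kafka Connect API and the Azure SDK; in a real sink connector, similar logic would presumably live in the task's put/flush path. The class name `JsonLinesBatcher`, the `max_records` threshold, and the `topic/partition/start-offset` naming scheme are all assumptions made for illustration, not something the answer prescribes.

```python
import json


class JsonLinesBatcher:
    """Buffers JSON-serializable records and emits one blob payload per batch.

    Hypothetical helper: the name, threshold, and blob naming are illustrative.
    """

    def __init__(self, topic, partition, max_records=1000):
        self.topic = topic
        self.partition = partition
        self.max_records = max_records  # assumed flush threshold
        self._buffer = []
        self._start_offset = None

    def add(self, offset, record):
        """Buffer one record; return (blob_name, payload) if the batch is full, else None."""
        if self._start_offset is None:
            self._start_offset = offset  # remember the first offset in this batch
        self._buffer.append(record)
        if len(self._buffer) >= self.max_records:
            return self.flush()
        return None

    def flush(self):
        """Return (blob_name, payload) for the buffered records, or None if empty."""
        if not self._buffer:
            return None
        # Zero-padded start offset keeps blob names sortable and unique per batch.
        name = f"{self.topic}/{self.partition}/{self._start_offset:020d}.jsonl"
        lines = (json.dumps(r, separators=(",", ":")) + "\n" for r in self._buffer)
        payload = "".join(lines).encode("utf-8")
        self._buffer = []
        self._start_offset = None
        return name, payload


# Usage: two records with a batch size of two produce one blob payload.
batcher = JsonLinesBatcher("events", 0, max_records=2)
first = batcher.add(42, {"id": 1})   # batch not full yet
second = batcher.add(43, {"id": 2})  # batch full: (blob_name, payload)
```

The returned payload would then be uploaded as one blob by whatever storage client the connector uses. Deriving the blob name from the topic, partition, and starting offset means a retried batch overwrites the same blob rather than creating a duplicate, which is one common way to keep a sink idempotent.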
