Kafka as a data store for future events

Question

I have a Kafka cluster which receives messages from a source based on data changes in that source. In some cases the messages are meant to be processed in the future. So I have 2 options:

  1. Consume all messages, and publish the ones meant for the future back to Kafka on a different topic (with the date in the topic name); a Storm topology then looks for topics whose name contains that date. This ensures a message is processed only on its intended date (see the sketch after this list).
  2. Store them in a separate database, and build a scheduler that reads the messages only on that future date and publishes them to Kafka.
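
A minimal sketch of the routing half of option 1, assuming Kafka's Java producer, a hypothetical `events-yyyy-MM-dd` topic-naming scheme, and a broker at `localhost:9092`; the consumer side (e.g. the Storm topology) would subscribe only to the topic carrying the current date:

```java
import java.time.LocalDate;
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class FutureEventRouter {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Suppose this (hypothetical) message must not be processed before 2016-05-20.
            LocalDate processOn = LocalDate.of(2016, 5, 20);
            String payload = "{\"orderId\": 42, \"action\": \"expire\"}";

            // Route the message to a date-stamped topic, e.g. "events-2016-05-20".
            // LocalDate.toString() prints as yyyy-MM-dd.
            String topic = "events-" + processOn;
            producer.send(new ProducerRecord<>(topic, "order-42", payload));
        }
    }
}
```

Note that for this scheme to work, each topic's retention must be at least as long as the furthest scheduling horizon, which is exactly where the retention question below comes in.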

Option 1 is easier to execute but my question is: Is Kafka a durable data store? And has anyone done this sort of eventing with Kafka? Are there any gaping holes in the design?

Answer

You can configure the amount of time your messages stay in Kafka (log.retention.hours).
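
For reference, `log.retention.hours` is a broker-wide default set in `server.properties`; retention can also be raised per topic, which is more surgical if only the date-stamped "future event" topics need long retention. Below is a minimal sketch using the Java `AdminClient`, assuming a hypothetical topic named `events-2016-05-20` and a broker at `localhost:9092` (`alterConfigs` is deprecated in recent Kafka releases in favor of `incrementalAlterConfigs`, but is the longer-available call):

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class SetTopicRetention {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // Per-topic override: keep messages on this topic for 7 days.
            // (log.retention.hours in server.properties is the broker-wide default.)
            ConfigResource topic =
                    new ConfigResource(ConfigResource.Type.TOPIC, "events-2016-05-20");
            Config retention = new Config(Collections.singletonList(
                    new ConfigEntry("retention.ms", String.valueOf(7L * 24 * 60 * 60 * 1000))));
            Map<ConfigResource, Config> update = Collections.singletonMap(topic, retention);
            admin.alterConfigs(update).all().get();  // block until the change is applied
        }
    }
}
```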

But keep in mind that Kafka is meant to be used as a "real-time buffer" between your producers and your consumers, not as a durable data store. I don't think Kafka+Storm would be the appropriate tool for your use case. Why not just write your messages to some distributed file system, and schedule a job (MapReduce, Spark, ...) to process those events?
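
If you go that route, here is a minimal sketch of the daily processing job, assuming Spark's Java API, a hypothetical HDFS layout where producers append events under `/events/due_date=yyyy-MM-dd/`, and some external scheduler (cron, Oozie, ...) that launches the job once per day:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ProcessDueEvents {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("process-due-events")
                .getOrCreate();

        // Read only the events whose due date is today; the directory layout
        // (/events/due_date=yyyy-MM-dd/) is an assumption, not a Kafka convention.
        String today = java.time.LocalDate.now().toString();
        Dataset<Row> due = spark.read().json("hdfs:///events/due_date=" + today);

        due.show();  // stand-in for the real processing logic

        spark.stop();
    }
}
```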
