将 Kafka 输入流动态连接到多个输出流 [英] Dynamically connecting a Kafka input stream to multiple output streams

查看:38
本文介绍了将 Kafka 输入流动态连接到多个输出流的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

Kafka Streams 是否内置了允许将单个输入流动态连接到多个输出流的功能?KStream.branch 允许基于真/假谓词的分支,但这不是我想要的.我希望每个传入的日志都确定它将在运行时流式传输到的主题,例如,日志 {"date": "2017-01-01"} 将流式传输到主题 topic-2017-01-01 和日志 {"date": "2017-01-02"} 将流式传输到主题 topic-2017-01-02.

Is there functionality built into Kafka Streams that allows for dynamically connecting a single input stream into multiple output streams? KStream.branch allows branching based on true/false predicates, but this isn't quite what I want. I'd like each incoming log to determine the topic it will be streamed to at runtime, e.g., a log {"date": "2017-01-01"} will be streamed to the topic topic-2017-01-01 and a log {"date": "2017-01-02"} will be streamed to the topic topic-2017-01-02.

我可以在流上调用 forEach,然后写入 Kafka 生产者,但这似乎不太优雅.在 Streams 框架内有没有更好的方法来做到这一点?

I could call forEach on the stream, then write to a Kafka producer, but that doesn't seem very elegant. Is there a better way to do this within the Streams framework?

推荐答案

如果您想根据数据动态创建主题,目前 Kafka 的 Streaming API 中没有任何支持 (v0.10.2 及更早版本).您需要创建一个 KafkaProducer 并自己实现动态路由"(例如使用 KStream#foreach()KStream#process()代码>).请注意,您需要进行同步写入以避免数据丢失(不幸的是性能不是很好).有计划使用动态主题路由来扩展 Streaming API,但目前还没有具体的时间表.

If you want to create topics dynamically based on your data, you do not get any support within Kafka's Streaming API at the moment (v0.10.2 and earlier). You will need to create a KafkaProducer and implement your dynamic "routing" by yourself (for example using KStream#foreach() or KStream#process()). Note, that you need to do synchronous writes to avoid data loss (which are not very performant unfortunately). There are plans to extend Streaming API with dynamic topic routing, but there is no concrete timeline for this feature right now.

您还应该考虑另外一个因素.如果您事先不知道您的目标主题,而只是依赖所谓的主题自动创建"功能,您应该确保这些主题是使用所需的配置设置(例如,分区数)创建的或复制因子).

There is one more consideration you should take into account. If you do not know your destination topic(s) ahead of time and just rely on the so-called "topic auto creation" feature, you should make sure that those topics are being created with the desired configuration settings (e.g., number of partitions or replication factor).

作为主题自动创建"的替代方案,您还可以使用管理客户端(自 v0.10.1 起可用)来创建具有正确配置的主题.参见 https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations

As an alternative to "topic auto creation" you can also use Admin Client (available since v0.10.1) to create topics with correct configuration. See https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations

这篇关于将 Kafka 输入流动态连接到多个输出流的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆