将Kafka输入流动态连接到多个输出流 [英] Dynamically connecting a Kafka input stream to multiple output streams

查看:129
本文介绍了将Kafka输入流动态连接到多个输出流的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

Kafka Streams内置的功能是否允许将单个输入流动态连接到多个输出流? KStream.branch 允许基于true / false谓词的分支,但这不是我想要的。我希望每个传入日志确定它将在运行时流式传输到的主题,例如,日志 {date:2017-01-01} 将流式传输到主题主题-2017-01-01 和日志 {date:2017-01-02} 将流式传输到主题主题-2017-01-02

Is there functionality built into Kafka Streams that allows for dynamically connecting a single input stream into multiple output streams? KStream.branch allows branching based on true/false predicates, but this isn't quite what I want. I'd like each incoming log to determine the topic it will be streamed to at runtime, e.g., a log {"date": "2017-01-01"} will be streamed to the topic topic-2017-01-01 and a log {"date": "2017-01-02"} will be streamed to the topic topic-2017-01-02.

我可以致电<流上的code> forEach ,然后写入Kafka制作人,但这似乎不是很优雅。在Streams框架中有更好的方法吗?

I could call forEach on the stream, then write to a Kafka producer, but that doesn't seem very elegant. Is there a better way to do this within the Streams framework?

推荐答案

如果您想根据数据动态创建主题,目前你在Kafka的Streaming API中没有获得任何支持( v0.10.2 及更早版本)。您需要创建一个 KafkaProducer 并自己实现动态路由(例如使用 KStream#foreach() KStream #process())。请注意,您需要执行同步写入以避免数据丢失(不幸的是,这不是非常高性能)。有计划使用动态主题路由扩展Streaming API,但目前此功能没有具体的时间表。

If you want to create topics dynamically based on your data, you do not get any support within Kafka's Streaming API at the moment (v0.10.2 and earlier). You will need to create a KafkaProducer and implement your dynamic "routing" by yourself (for example using KStream#foreach() or KStream#process()). Note, that you need to do synchronous writes to avoid data loss (which are not very performant unfortunately). There are plans to extend Streaming API with dynamic topic routing, but there is no concrete timeline for this feature right now.

还有一个需要考虑的因素。如果您不提前知道目标主题并且只依赖于所谓的主题自动创建功能,则应确保使用所需的配置设置创建这些主题(例如,分区数量)或者复制因子)。

There is one more consideration you should take into account. If you do not know your destination topic(s) ahead of time and just rely on the so-called "topic auto creation" feature, you should make sure that those topics are being created with the desired configuration settings (e.g., number of partitions or replication factor).

作为主题自动创建的替代方法,您还可以使用Admin Client(自 v0.10.1 )创建配置正确的主题。请参阅 https:/ /cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations

As an alternative to "topic auto creation" you can also use Admin Client (available since v0.10.1) to create topics with correct configuration. See https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations

这篇关于将Kafka输入流动态连接到多个输出流的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆