Kafka Streams filtering: broker or consumer side?

Question

I am looking into Kafka Streams. I want to filter my stream using a filter with very low selectivity (about one record in a few thousand passes). I was looking at this method: https://kafka.apache.org/0100/javadoc/org/apache/kafka/streams/kstream/KStream.html#filter(org.apache.kafka.streams.kstream.Predicate)

But I can't find any evidence of whether the filter is evaluated by the consumer (I really do not want to transfer many GB to the consumer just to throw them away) or inside the broker (yay!).

If it is evaluated on the consumer side, is there any way to do this in the broker?

Thanks!

Recommended answer

Kafka does not support broker-side filtering. If you use the Streams API, filtering is done in your application: the predicate is not evaluated by KafkaConsumer but within a "processor node" of your topology (i.e., within the Streams API runtime code).
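
To make the consequence concrete, here is a minimal plain-Java model of that flow. It is only a sketch and uses no Kafka classes: `Event`, `FilterResult`, and `runFilterNode` are hypothetical names invented for illustration. It assumes the consumer has already fetched every record, and shows that the filter node still has to look at each of them before dropping most.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Plain-Java model of client-side filtering; none of these types are Kafka classes.
final class ClientSideFilterSketch {

    // Hypothetical stand-in for a record the consumer has already fetched.
    static final class Event {
        final String key;
        final int value;

        Event(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    // How many records the filter node evaluated, and which ones it forwarded.
    static final class FilterResult {
        final int evaluated;
        final List<Event> forwarded;

        FilterResult(int evaluated, List<Event> forwarded) {
            this.evaluated = evaluated;
            this.forwarded = forwarded;
        }
    }

    // Models a filter processor node: it sees every fetched record and
    // forwards only those that satisfy the predicate.
    static FilterResult runFilterNode(List<Event> fetched, Predicate<Event> predicate) {
        List<Event> forwarded = new ArrayList<>();
        int evaluated = 0;
        for (Event e : fetched) {
            evaluated++;
            if (predicate.test(e)) {
                forwarded.add(e);
            }
        }
        return new FilterResult(evaluated, forwarded);
    }

    public static void main(String[] args) {
        // Pretend these 5000 records all crossed the network from the broker.
        List<Event> fetched = new ArrayList<>();
        for (int i = 1; i <= 5000; i++) {
            fetched.add(new Event("k" + i, i));
        }
        // Low-selectivity predicate: only values divisible by 2500 pass.
        FilterResult r = runFilterNode(fetched, e -> e.value % 2500 == 0);
        System.out.println(r.evaluated + " evaluated, " + r.forwarded.size() + " forwarded");
    }
}
```

In this model the network cost is proportional to the number of records evaluated, not the number forwarded, which is exactly the concern raised in the question.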

This might help: https://docs.confluent.io/current/streams/architecture.html

The reason for not supporting broker-side filtering is that brokers (1) only use byte arrays as key and value data types and (2) use a zero-copy mechanism to achieve high throughput. Broker-side filtering would require deserializing the data at the broker, which would be a major performance hit (deserialization cost and no zero-copy optimization).
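
The byte-array point can be illustrated with a small plain-Java sketch. Again, nothing here is Kafka API: `OpaqueBytesSketch`, `serialize`, `deserialize`, and `filterOnClient` are hypothetical names, and UTF-8 strings are assumed as the payload format purely for the example. The stored payloads are opaque `byte[]` values, so a predicate over the typed value can only run after something deserializes them, which in Kafka's design is the client.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Plain-Java illustration; the payload format (UTF-8 text) is an assumption.
final class OpaqueBytesSketch {

    // Producer side: the typed value becomes an opaque byte array.
    static byte[] serialize(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    // Client side: only here is the payload turned back into a typed value.
    static String deserialize(byte[] payload) {
        return new String(payload, StandardCharsets.UTF_8);
    }

    // The predicate needs a String, so every payload must be deserialized
    // before it can be tested. This is the cost a broker would have to pay
    // if it evaluated the filter itself.
    static List<String> filterOnClient(List<byte[]> payloads, Predicate<String> predicate) {
        List<String> kept = new ArrayList<>();
        for (byte[] payload : payloads) {
            String value = deserialize(payload);
            if (predicate.test(value)) {
                kept.add(value);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<byte[]> stored = new ArrayList<>();
        stored.add(serialize("INFO started"));
        stored.add(serialize("ERROR disk full"));
        stored.add(serialize("INFO stopped"));

        List<String> errors = filterOnClient(stored, v -> v.startsWith("ERROR"));
        System.out.println(errors);
    }
}
```

The sketch does not model zero-copy transfer; it only shows why a value-based predicate cannot be applied to the stored bytes without a deserialization step.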
