Kafka Streams filtering: broker or consumer side?


Question

I am looking into Kafka Streams. I want to filter my stream using a filter with very low selectivity (it keeps about one record in a few thousand). I was looking at this method: https://kafka.apache.org/0100/javadoc/org/apache/kafka/streams/kstream/KStream.html#filter(org.apache.kafka.streams.kstream.Predicate)

But I can't find any evidence of whether the filter is evaluated by the consumer (I really do not want to transfer many GB to the consumer just to throw them away) or inside the broker (yay!).

If it is evaluated on the consumer side, is there any way to do this in the broker?

Thanks!

Recommended answer

Kafka does not support broker-side filtering. If you use the Streams API, filtering is done in your application (the predicate is not evaluated by KafkaConsumer but within a "processor node" of your topology, i.e., within the Streams API runtime code).

This might help: https://docs.confluent.io/current/streams/architecture.html

The reason for not supporting broker-side filtering is that brokers (1) only use byte arrays as key and value data types and (2) use a zero-copy mechanism to achieve high throughput. Broker-side filtering would require deserializing the data at the broker, which would be a major performance hit (deserialization cost and no zero-copy optimization).
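
To make the byte-array point concrete, here is a minimal plain-Java sketch. It does not use any Kafka classes; `ClientSideFilterSketch`, `fetchFromBroker`, `filterClientSide`, and `bytesTransferred` are hypothetical names invented for illustration, and the in-memory list only stands in for a fetched batch. It shows that the predicate can run only after the application has deserialized each value, so every record is transferred no matter how few survive the filter.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class ClientSideFilterSketch {

    // Hypothetical stand-in for a fetched batch: the broker side only ever
    // handles opaque byte arrays and knows nothing about their structure.
    public static List<byte[]> fetchFromBroker() {
        List<byte[]> batch = new ArrayList<>();
        for (String s : new String[] {"INFO started", "ERROR disk full", "INFO heartbeat"}) {
            batch.add(s.getBytes(StandardCharsets.UTF_8));
        }
        return batch;
    }

    // Runs in the application: deserialize each value, then apply the predicate.
    public static List<String> filterClientSide(List<byte[]> batch, Predicate<String> predicate) {
        List<String> kept = new ArrayList<>();
        for (byte[] raw : batch) {
            // The deserialization cost is paid here, on the client.
            String value = new String(raw, StandardCharsets.UTF_8);
            if (predicate.test(value)) {
                kept.add(value);
            }
        }
        return kept;
    }

    // Total bytes in the batch, independent of how many records pass the filter.
    public static long bytesTransferred(List<byte[]> batch) {
        long total = 0;
        for (byte[] raw : batch) {
            total += raw.length;
        }
        return total;
    }

    public static void main(String[] args) {
        List<byte[]> batch = fetchFromBroker();
        List<String> kept = filterClientSide(batch, v -> v.startsWith("ERROR"));
        System.out.println("transferred bytes: " + bytesTransferred(batch));
        System.out.println("kept records: " + kept);
    }
}
```

With a filter that keeps one record in a few thousand, the transferred byte count stays the same while the kept list shrinks. If the network volume matters, one option is to run the filtering application close to the brokers and write the surviving records to a smaller topic for downstream consumers.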
