Implement filtering for Kafka messages


Question

I have started using Kafka recently and am evaluating Kafka for a few use cases.

If we wanted to provide the capability of filtering messages for consumers (subscribers) based on message content, what is the best approach for doing this?

Say a topic named "Trades" is exposed by a producer, carrying different trade details such as market name, creation date, price, etc.

Some consumers are interested in trades for specific markets, while others are interested in trades after a certain date, etc. (content-based filtering)

Since filtering is not possible on the broker side, what is the best possible approach for implementing the cases below:

  1. If the filtering criteria are specific to a single consumer, should we use a ConsumerInterceptor (even though interceptors are suggested for logging purposes, per the documentation)?
  2. If the filtering criteria (content-based filtering) are common among consumers, what should the approach be?

Listen to the topic, filter the messages locally, and write them to a new topic (using either an interceptor or Streams)?

Answer

If I understand your question correctly, you have one topic and different consumers that are interested in specific parts of that topic. At the same time, you do not own those consumers and want to avoid having them read the whole topic and do the filtering themselves?

For this, the only way to go is to build a new application that reads the whole topic, does the filtering (or actually splitting), and writes the data back into two (or more) different topics. The external consumers would consume from those new topics and only receive the data they are interested in.

Using Kafka Streams for this purpose would be a very good way to go. The DSL should offer everything you need.
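As a minimal sketch of such a splitting topology with the Streams DSL (the topic names `Trades-NYSE`/`Trades-LSE`, the JSON string values, and the market names are assumptions for illustration, not part of the original question):

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class TradeSplitter {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "trade-splitter");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> trades = builder.stream("Trades");

        // Route each trade to a market-specific topic based on its content
        // (assumes trade values are JSON strings containing a "market" field).
        trades.filter((key, value) -> value.contains("\"market\":\"NYSE\""))
              .to("Trades-NYSE");
        trades.filter((key, value) -> value.contains("\"market\":\"LSE\""))
              .to("Trades-LSE");

        new KafkaStreams(builder.build(), props).start();
    }
}
```

In a real application you would deserialize the trade into a proper type and apply the predicate to a parsed field rather than matching on the raw string.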

As an alternative, you can just write your own application using KafkaConsumer and KafkaProducer to do the filtering/splitting manually in your user code. This would not be much different from using Kafka Streams, as a Kafka Streams application would do the exact same thing internally. However, with Streams your effort to get it done would be way less.
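The manual variant is essentially a consume-filter-produce loop; a sketch under the same assumptions as above (string-serialized trades, a hypothetical `Trades-NYSE` output topic):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TradeFilterApp {
    public static void main(String[] args) {
        Properties consumerProps = new Properties();
        consumerProps.put("bootstrap.servers", "localhost:9092");
        consumerProps.put("group.id", "trade-filter");
        consumerProps.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        consumerProps.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        Properties producerProps = new Properties();
        producerProps.put("bootstrap.servers", "localhost:9092");
        producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            consumer.subscribe(List.of("Trades"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Forward only the trades that match the filter to the derived topic.
                    if (record.value().contains("\"market\":\"NYSE\"")) {
                        producer.send(new ProducerRecord<>("Trades-NYSE", record.key(), record.value()));
                    }
                }
            }
        }
    }
}
```

Note that with this approach you take on offset management, error handling, and scaling yourself, which is exactly the work the Streams library would otherwise handle for you.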

I would not use interceptors for this. Even if it would work, it does not seem to be a good software design for your use case.

