How to write Kafka consumers: single-threaded vs multi-threaded


Question

I have written a single Kafka consumer (using Spring Kafka) that reads from a single topic and is part of a consumer group. Once a message is consumed, it performs all downstream operations and moves on to the next message offset. I have packaged this as a WAR file, and my deployment pipeline pushes it out to a single instance. Using the same pipeline, I could potentially deploy this artifact to multiple instances in my deployment pool.

However, when I want multiple consumers as part of my infrastructure, I am not able to understand the following:

  • I can define multiple instances in my deployment pool and have this WAR running on all of them. This would mean that all of them listen to the same topic, are part of the same consumer group, and divide the partitions among themselves. The downstream logic will work as is. This works perfectly well for my use case; however, I am not sure whether it is the optimal approach to follow.
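To make "divide the partitions among themselves" concrete, here is a minimal sketch of the idea in plain Java. It does not use the Kafka client, and the class name `PartitionAssignmentSketch` and its `assign` method are hypothetical names invented for illustration. It assumes a simple round-robin split; Kafka's real assignors are more involved, but they share the property modelled here: each partition is owned by exactly one consumer in the group.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: not Kafka's actual assignment algorithm.
// Models one property: every partition has exactly one owner in the group.
class PartitionAssignmentSketch {

    // Returns, for each consumer index, the partition numbers it owns.
    static List<List<Integer>> assign(int partitionCount, int consumerCount) {
        if (consumerCount <= 0) {
            throw new IllegalArgumentException("need at least one consumer");
        }
        List<List<Integer>> owned = new ArrayList<>();
        for (int c = 0; c < consumerCount; c++) {
            owned.add(new ArrayList<>());
        }
        // Hand out partitions one at a time, cycling through the consumers.
        for (int p = 0; p < partitionCount; p++) {
            owned.get(p % consumerCount).add(p);
        }
        return owned;
    }

    public static void main(String[] args) {
        // Six partitions, three instances: two partitions each.
        System.out.println(assign(6, 3)); // [[0, 3], [1, 4], [2, 5]]
        // Two partitions, three instances: the third instance sits idle.
        System.out.println(assign(2, 3)); // [[0], [1], []]
    }
}
```

The second call shows the practical limit of this approach: adding instances beyond the partition count adds no throughput, because the extra consumers receive nothing to read.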

Reading online, I came across resources here and here, where people define a single consumer thread but internally create multiple worker threads. There are also examples that define multiple consumer threads, each of which runs the downstream logic. Thinking about these approaches and mapping them to deployment environments, we could achieve the same result as my solution above, but with fewer machines.

Personally, I think my solution is simple and scalable but might not be optimal, while the second approach might be optimal. I would like to know your experiences, suggestions, or any other metrics or constraints I should consider. I am also thinking that, with my solution, I could use bare-bones machines as Kafka consumers.

I know I haven't posted any code; please let me know if I need to move this question to another forum. I can provide specific code examples if you need them, but I didn't think they were important in the context of my question.

Recommended answer

Your existing solution is best. Handing off to another thread will cause problems with offset management. Spring Kafka allows you to run multiple consumer threads in each instance, as long as you have enough partitions, because each partition is read by only one consumer in the group at a time.
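The offset problem comes from worker threads finishing out of order. If the record at offset 2 completes before offset 0 and the consumer commits past it, a crash at that point means offset 0 is never reprocessed. A hand-off design therefore has to track completions and commit only up to the first gap. The sketch below shows that bookkeeping in plain Java; `OffsetTrackerSketch` and its methods are hypothetical names, not part of Kafka or Spring Kafka, and it assumes a single partition with offsets that increase by one.

```java
import java.util.TreeSet;

// Hypothetical helper for illustration; not a Kafka or Spring Kafka class.
// Tracks out-of-order completions for ONE partition and exposes the
// highest offset that is safe to commit.
class OffsetTrackerSketch {
    private final TreeSet<Long> completed = new TreeSet<>();
    // Kafka commits the offset of the NEXT record to read, so this starts
    // at the first offset handed to the workers.
    private long nextToCommit;

    OffsetTrackerSketch(long startOffset) {
        this.nextToCommit = startOffset;
    }

    // Called by a worker thread when it has finished processing a record.
    synchronized void markDone(long offset) {
        completed.add(offset);
        // Advance only across a contiguous run of finished offsets.
        while (completed.remove(nextToCommit)) {
            nextToCommit++;
        }
    }

    // The value the consumer thread could safely commit right now.
    synchronized long committableOffset() {
        return nextToCommit;
    }

    public static void main(String[] args) {
        OffsetTrackerSketch tracker = new OffsetTrackerSketch(0);
        tracker.markDone(2);                             // a fast worker finishes first
        System.out.println(tracker.committableOffset()); // 0: offsets 0 and 1 still in flight
        tracker.markDone(0);
        System.out.println(tracker.committableOffset()); // 1
        tracker.markDone(1);
        System.out.println(tracker.committableOffset()); // 3: the gap has closed
    }
}
```

This bookkeeping, plus handling rebalances while work is still in flight, is the complexity the answer recommends avoiding. With one consumer thread per partition, records are processed in order and the committed offset is simply the last one processed plus one. In Spring Kafka the number of consumer threads per instance is set through the listener container's `concurrency` setting.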
