How to write Kafka consumers - single-threaded vs multi-threaded


Question

I have written a single Kafka consumer (using Spring Kafka) that reads from a single topic and is part of a consumer group. Once a message is consumed, it performs all downstream operations and moves on to the next message offset. I have packaged this as a WAR file, and my deployment pipeline pushes it out to a single instance. Using the same pipeline, I could deploy this artifact to multiple instances in my deployment pool.

However, there are a few things I do not understand about running multiple consumers as part of my infrastructure -


  • I can define multiple instances in my deployment pool and run this WAR on all of them. They would all listen to the same topic, belong to the same consumer group, and divide the partitions among themselves. The downstream logic would work as is. This works perfectly well for my use case, but I am not sure whether it is the optimal approach.

  • Reading online, I came across resources here and here, where people define a single consumer thread that internally creates multiple worker threads. There are also examples that define multiple consumer threads, each running the downstream logic. Mapping these approaches to deployment environments, we could achieve the same result as my theoretical solution above, but with fewer machines.
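
The two layouts above can be compared with a minimal sketch. Kafka gives each partition to exactly one consumer in a group, so what matters is the total number of consumers in the group, whether they run as separate instances or as threads inside fewer instances. The round-robin split below is an assumption made for illustration and is not Kafka's actual assignor; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: an even round-robin split, not Kafka's real assignment strategy. */
public class PartitionSplitSketch {

    /**
     * Distributes partition ids 0..partitionCount-1 across consumerCount consumers.
     * Every partition ends up with exactly one owner.
     */
    public static List<List<Integer>> assign(int partitionCount, int consumerCount) {
        if (consumerCount <= 0) {
            throw new IllegalArgumentException("need at least one consumer");
        }
        List<List<Integer>> owned = new ArrayList<>();
        for (int c = 0; c < consumerCount; c++) {
            owned.add(new ArrayList<>());
        }
        for (int p = 0; p < partitionCount; p++) {
            owned.get(p % consumerCount).add(p);
        }
        return owned;
    }

    public static void main(String[] args) {
        // Six single-threaded instances, or two instances with three consumer
        // threads each: both are six consumers in the group, so the split is the same.
        System.out.println(assign(6, 6)); // [[0], [1], [2], [3], [4], [5]]

        // Eight consumers for six partitions: the last two own nothing and sit idle.
        System.out.println(assign(6, 8)); // [[0], [1], [2], [3], [4], [5], [], []]
    }
}
```

Under this simplified model, adding consumers beyond the partition count adds no throughput, regardless of how many machines host them.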

Personally, I think my solution is simple and scalable but might not be optimal, while the second approach might be optimal. I would like to hear your experiences, suggestions, or any other metrics or constraints I should consider. Also, with my theoretical solution, I could employ bare-bones machines as Kafka consumers.

I know I haven't posted any code, so please let me know if I should move this question to another forum. I can provide specific code examples if needed, but I didn't think they were important in the context of my question.

Recommended answer

Your existing solution is best. Handing off to another thread will cause problems with offset management. Spring Kafka allows you to run multiple threads in each instance, as long as you have enough partitions.
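
To make the offset-management concern concrete, here is a small simulation. It relies on one assumption about Kafka that is safe to state: a committed offset means "resume from here", so committing past an unfinished message skips it after a restart. The class and method names are hypothetical, and the "safer" variant is only one possible workaround, not something the answer prescribes.

```java
import java.util.List;
import java.util.TreeSet;

/** Illustrative only: models commit positions, without a real Kafka client. */
public class OffsetHandoffSketch {

    /** Naive hand-off: commit (offset + 1) whenever any worker finishes. */
    public static long naiveCommit(List<Long> finishedOffsets) {
        long committed = 0;
        for (long offset : finishedOffsets) {
            committed = Math.max(committed, offset + 1);
        }
        return committed;
    }

    /** Safer hand-off: advance the commit only across a gap-free run of finished offsets. */
    public static long contiguousCommit(List<Long> finishedOffsets) {
        TreeSet<Long> done = new TreeSet<>(finishedOffsets);
        long committed = 0;
        while (done.contains(committed)) {
            committed++;
        }
        return committed;
    }

    public static void main(String[] args) {
        // Offsets 0..4 were handed to workers; offset 1 is still running at crash time.
        List<Long> finishedBeforeCrash = List.of(0L, 2L, 3L, 4L);

        // Resumes at 5, so offset 1 is never processed.
        System.out.println(naiveCommit(finishedBeforeCrash));      // 5

        // Resumes at 1, so offsets 2..4 are processed again but nothing is lost.
        System.out.println(contiguousCommit(finishedBeforeCrash)); // 1
    }
}
```

Processing each message on the consumer thread itself avoids this bookkeeping entirely, which is why scaling by adding consumers (more instances, or the listener container's `concurrency` setting in Spring Kafka) is usually the simpler route.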
