Can consumer groups span different nodes in a cluster?


Question

I've seen a lot of examples of using the high level consumer (consumer group) to consume a topic using many threads within the same process. Can you have multiple processes (on different machines) split the partitions and consume in parallel? If so, do you have any examples?

Recommended answer

The short answer is yes. With the high-level consumer, each thread handles one or more partitions, and ZooKeeper is used to coordinate. Because coordination goes through ZooKeeper, it's fine to spread the consumers across separate processes and machines. The Kafka wiki has an example using the high-level consumer; you can run it on multiple machines to see this in action. The high-level consumer automatically rebalances partitions across consumers when consumers are added or removed. Remember that partitions define the level of parallelism for a topic, so if you have more consumer threads than partitions, some of those threads will just sit idle.
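To make the partition split concrete, here is a minimal Python sketch of a range-style assignment of partitions to consumer threads. It is a simplified illustration, not Kafka's actual rebalance code: the function name `assign_partitions` and the thread IDs such as `hostA-0` are hypothetical, and the real consumer coordinates the split through ZooKeeper rather than a single function call.

```python
from typing import Dict, List


def assign_partitions(num_partitions: int,
                      consumer_threads: List[str]) -> Dict[str, List[int]]:
    """Split partitions 0..num_partitions-1 over consumer threads in ranges.

    Illustrative only: every member of the group is assumed to see the same
    sorted list of thread IDs, so each one can compute the same result.
    """
    threads = sorted(consumer_threads)
    assignment: Dict[str, List[int]] = {t: [] for t in threads}
    if not threads:
        return assignment
    # Each thread gets `base` partitions; the first `extra` threads get one more.
    base, extra = divmod(num_partitions, len(threads))
    for i, thread in enumerate(threads):
        start = base * i + min(i, extra)
        count = base + (1 if i < extra else 0)
        assignment[thread] = list(range(start, start + count))
    return assignment


# Two processes on different machines (hypothetical IDs), two threads each.
threads = ["hostA-0", "hostA-1", "hostB-0", "hostB-1"]

# Four partitions, four threads: one partition per thread.
four = assign_partitions(4, threads)

# Three partitions, four threads: the last thread has nothing to do.
three = assign_partitions(3, threads)

# The process on hostB goes away: the remaining threads pick up its partitions.
after_leave = assign_partitions(4, ["hostA-0", "hostA-1"])

print(four)
print(three)
print(after_leave)
```

With three partitions and four threads, `hostB-1` ends up with an empty list, which is the idle-thread case described above. When the `hostB` process disappears, recomputing the assignment hands all four partitions to the two `hostA` threads, which mirrors what a rebalance achieves.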

It's also worth noting that Kafka does not provide any sort of distributed framework for running consumer applications across machines. That's where systems like Storm or Spark are useful, since they can consume from Kafka and manage the processes doing the consuming. The folks behind Kafka also recently open-sourced a package called Samza, which provides higher-level Kafka-based stream processing on Hadoop/YARN.
