Can consumer groups span different nodes in a cluster?

Question

I've seen a lot of examples of using the high-level consumer (consumer group) to consume a topic with many threads inside a single process. Can multiple processes, running on different machines, split the partitions among themselves and consume in parallel? If so, do you have any examples?

Recommended answer

The short answer is yes. With the high-level consumer, each thread handles one or more partitions, and ZooKeeper is used to coordinate them. Because coordination goes through ZooKeeper rather than through shared memory, it's fine to spread the threads across separate processes and machines, as long as they all use the same consumer group. The Kafka wiki has an example using the high-level consumer; you can run it on multiple machines to see this in action. The high-level consumer automatically rebalances partitions across consumers when consumers are added or removed. Remember that partitions define the level of parallelism for a topic, so if you have more consumer threads than partitions, some of those threads will just sit idle.
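To make the point about partitions and idle threads concrete, here is a minimal sketch in Python. It is not Kafka code and does not talk to ZooKeeper; it only simulates a range-style split of partition ids across consumer threads. The function name `assign_partitions` and the thread ids such as `hostA-0` are hypothetical, chosen for illustration.

```python
def assign_partitions(num_partitions, consumer_threads):
    """Split partition ids 0..num_partitions-1 across consumer threads.

    Illustrative range-style assignment: threads are sorted by id, each gets
    an equal share, and the first few threads absorb any remainder.
    This is a simulation, not Kafka's actual implementation.
    """
    threads = sorted(consumer_threads)
    if not threads:
        return {}
    per_thread, extra = divmod(num_partitions, len(threads))
    assignment = {}
    start = 0
    for index, thread_id in enumerate(threads):
        # The first `extra` threads take one additional partition.
        count = per_thread + (1 if index < extra else 0)
        assignment[thread_id] = list(range(start, start + count))
        start += count
    return assignment


if __name__ == "__main__":
    # Two machines (hypothetical ids), two threads each, one 3-partition topic:
    # four threads compete for three partitions, so one thread gets nothing.
    both_hosts = ["hostA-0", "hostA-1", "hostB-0", "hostB-1"]
    print(assign_partitions(3, both_hosts))

    # hostB goes away; a rebalance redistributes all partitions to hostA.
    print(assign_partitions(3, ["hostA-0", "hostA-1"]))
```

In the first call one thread ends up with an empty list, which corresponds to the idle thread described above. The second call mimics what a rebalance achieves after a machine leaves the group: the remaining threads take over every partition.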

It's also worth noting that Kafka does not provide any sort of distributed framework for running the consumer applications across machines; starting, placing, and supervising those processes is up to you. That's where systems like Storm or Spark are useful, since they can consume from Kafka and manage the processes doing the consuming. The folks behind Kafka also recently open-sourced a package called Samza, which provides higher-level Kafka-based stream processing on Hadoop/YARN.
