Why is Kafka pull-based instead of push-based?

Question

Why is Kafka pull-based instead of push-based? I agree that Kafka delivers high throughput, as I have experienced it myself, but I don't see how its throughput would go down if it were push-based. Any ideas on how a push-based design could degrade performance?

Recommended answer

Scalability is the major driving factor when designing such systems (pull vs. push). Kafka is very scalable: one of its key benefits is that it is easy to add a large number of consumers without affecting performance and without downtime.

Kafka can handle events from producers at rates of 100k+ per second. Because Kafka consumers pull data from the topic, different consumers can consume the messages at different paces; a slow consumer simply falls behind and catches up later, whereas a broker that pushed at its own rate could overwhelm it. Kafka also supports different consumption models: you can have one consumer processing messages in real time and another processing them in batch mode.
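To make the "different pace" point concrete, here is a minimal in-memory sketch of the pull model. It does not use the Kafka client; `Log` and `Consumer` are hypothetical classes invented for illustration, with a plain list standing in for one topic partition. Each consumer keeps its own offset and decides how much to fetch per poll.

```python
class Log:
    """Append-only in-memory log standing in for one topic partition (illustrative only)."""

    def __init__(self):
        self._records = []

    def append(self, record):
        # The producer side: append and return the record's offset.
        self._records.append(record)
        return len(self._records) - 1

    def fetch(self, offset, max_records):
        # The log does not track consumers; it just serves a slice on request.
        return self._records[offset:offset + max_records]

    def size(self):
        return len(self._records)


class Consumer:
    """Hypothetical consumer that pulls at its own pace and tracks its own offset."""

    def __init__(self, log, max_records):
        self.log = log
        self.max_records = max_records
        self.offset = 0
        self.seen = []

    def poll(self):
        batch = self.log.fetch(self.offset, self.max_records)
        self.offset += len(batch)
        self.seen.extend(batch)
        return batch


log = Log()
for i in range(10):
    log.append(f"event-{i}")

realtime = Consumer(log, max_records=1)  # handles one record per poll
batch = Consumer(log, max_records=5)     # pulls larger chunks, less often

for _ in range(3):
    realtime.poll()
batch.poll()
```

After this runs, the two consumers sit at different offsets in the same log, and neither one affected the other or required the log to know its speed.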

Another reason could be that Kafka was designed not just for a single consumer such as Hadoop. Different consumers can have diverse needs and capabilities.

Pull-based systems have some deficiencies, such as wasting resources on regular polling when no data is available. To alleviate this drawback, Kafka supports a 'long polling' mode in which a fetch request waits until data actually arrives.
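The long-polling idea can be sketched with a condition variable: a fetch blocks until data exists or a timeout expires, instead of returning empty and being retried in a tight loop. This is a simplified illustration, not Kafka's implementation; `LongPollLog` and its `max_wait_s` parameter are hypothetical names. In the real Kafka consumer, the comparable knobs are the `fetch.min.bytes` and `fetch.max.wait.ms` settings.

```python
import threading
import time


class LongPollLog:
    """In-memory log whose fetch() waits for data (illustrative only)."""

    def __init__(self):
        self._records = []
        self._cond = threading.Condition()

    def append(self, record):
        with self._cond:
            self._records.append(record)
            # Wake any fetch that is parked waiting for new data.
            self._cond.notify_all()

    def fetch(self, offset, max_wait_s):
        deadline = time.monotonic() + max_wait_s
        with self._cond:
            # Block while there is nothing at or beyond the requested offset.
            while len(self._records) <= offset:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    return []  # timed out with no data
                self._cond.wait(remaining)
            return self._records[offset:]


log = LongPollLog()

# No producer yet: the fetch gives up after the wait and returns nothing.
empty = log.fetch(0, max_wait_s=0.05)

# A producer appends shortly after the consumer starts waiting.
threading.Timer(0.05, log.append, args=("event-0",)).start()
got = log.fetch(0, max_wait_s=2.0)
```

The second fetch returns as soon as the record is appended, well before the two-second limit, so the consumer neither spins on empty responses nor adds much latency.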
