How to decide Kafka Cluster size

Question

I am planning to decide how many nodes should be present in a Kafka cluster. I am not sure about the parameters to take into consideration. I am sure it has to be >= 3 (with a replication factor of 2 and failure tolerance of 1 node).

Can someone tell me what parameters should be kept in mind while deciding the cluster size, and how they affect the size?

I know of the following factors, but I don't know how they quantitatively affect the cluster size (I know how they affect it qualitatively). Is there any other parameter that affects cluster size?

1. Replication factor (cluster size >= replication factor)
2. Node failure tolerance (cluster size >= node-failure + 1)
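
As a minimal sketch of how these two bounds combine (plain Python; the helper is illustrative, not a Kafka API), note that keeping a full replica set even after the tolerated failures is what pushes the estimate to >= 3:

def min_brokers(replication_factor, tolerated_failures):
    # Bound 1: cluster size >= replication factor
    # Bound 2: cluster size >= tolerated failures + 1
    bare_minimum = max(replication_factor, tolerated_failures + 1)
    # To still host a full replica set after losing the tolerated
    # number of brokers, replication_factor survivors are needed:
    with_full_replication = replication_factor + tolerated_failures
    return bare_minimum, with_full_replication

print(min_brokers(2, 1))  # -> (2, 3): hence the ">= 3" above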

Considering all parameters, my scenario is:

1. There are 3 topics.
2. Each topic has messages of different sizes. Message sizes range from 10 to 500 KB, with an average of 50 KB.
3. Each topic has a different number of partitions: 10, 100, and 500.
4. The retention period is 7 days.
5. 100 million messages are posted to each topic every day.
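
As a back-of-envelope check on these numbers (a Python sketch, assuming decimal units, no compression, and the replication factor of 2 mentioned above), the scenario already implies roughly:

topics = 3
msgs_per_day = 100_000_000        # per topic
avg_msg_bytes = 50 * 1000         # 50 KB average
retention_days = 7
replication_factor = 2            # from the question above

per_topic_daily = msgs_per_day * avg_msg_bytes          # 5 TB/day/topic
retained = per_topic_daily * retention_days * topics * replication_factor
avg_ingest = per_topic_daily * topics / 86_400          # bytes/s

print(f"retained on disk: {retained / 1e12:.0f} TB")    # ~210 TB
print(f"average ingest:   {avg_ingest / 1e6:.0f} MB/s") # ~174 MB/s

Peak rates will be higher than this daily average, so the broker count has to cover both the retained volume and the ingest rate.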

Can someone please point me to relevant documentation or any other blog that discusses this? I have searched Google but to no avail.

Answer

As I understand it, getting good throughput from Kafka doesn't depend only on the cluster size; there are other configurations which need to be considered as well. I will try to share as much as I can.

Kafka's throughput is supposed to scale linearly with the number of disks you have. The multiple data directories feature introduced in Kafka 0.8 allows Kafka's topics to have different partitions on different machines. As the partition count grows large, so do the chances that the leader election process will be slow, which also affects consumer rebalancing. This is something to consider, and it could become a bottleneck.
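
To put rough numbers on that risk (illustrative arithmetic only, using the partition counts from the question and a replication factor of 2), the per-broker replica count that leader election and rebalancing have to churn through looks like:

partition_counts = [10, 100, 500]   # the three topics in the question
replication_factor = 2

total_replicas = sum(partition_counts) * replication_factor  # 1220
for brokers in (3, 5, 10):          # candidate cluster sizes
    print(f"{brokers} brokers -> ~{total_replicas // brokers} replicas each")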

Another key thing could be the disk flush rate. Since Kafka always immediately writes all data to the filesystem, the more often data is flushed to disk, the more "seek-bound" Kafka becomes, and the lower the throughput. Conversely, a very low flush rate might lead to different problems, as in that case the amount of data to be flushed per flush will be large. So providing an exact figure is not very practical, and I think that is why you couldn't find such a direct answer in the Kafka documentation.
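
Kafka exposes this trade-off directly through its log flush settings. A hedged server.properties excerpt (the property names exist in stock Kafka; the values are placeholders only, and by default Kafka leaves flushing to the OS page cache):

# Flush after this many messages accumulate in a log...
log.flush.interval.messages=10000
# ...or after this many milliseconds, whichever comes first.
log.flush.interval.ms=1000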

There will be other factors too: for example, the consumer's fetch size, compression, the batch size for asynchronous producers, socket buffer sizes, etc.
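
To make those knobs concrete, here is a sketch using the third-party kafka-python client (the choice of client and every value here are assumptions for illustration; the Java client exposes the same settings under similar names):

from kafka import KafkaProducer, KafkaConsumer

# All values below are illustrative starting points, not recommendations.
producer = KafkaProducer(
    bootstrap_servers="broker:9092",   # placeholder address
    compression_type="snappy",         # trade CPU for network/disk volume
    batch_size=64 * 1024,              # async producer batch size (bytes)
    linger_ms=50,                      # wait up to 50 ms to fill a batch
    send_buffer_bytes=128 * 1024,      # socket send buffer
)

consumer = KafkaConsumer(
    "some_topic",                      # placeholder topic name
    bootstrap_servers="broker:9092",
    fetch_min_bytes=64 * 1024,         # consumer fetch size
    receive_buffer_bytes=128 * 1024,   # socket receive buffer
)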

Hardware & OS will also play a key role in this as using Kafka in a Linux based environment is advisable due to its pageCache mechanism for writing data to the disk. Read more on this here

You might also want to take a look at how the OS flush behavior plays a key role, and take it into consideration before you actually tune Kafka to fit your needs. I believe it is key to understand the design philosophy, which is what makes Kafka so effective in terms of throughput and fault-tolerance.

  • https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
  • http://blog.liveramp.com/2013/04/08/kafka-0-8-producer-performance-2/
  • https://grey-boundary.io/load-testing-apache-kafka-on-aws/
  • https://cwiki.apache.org/confluence/display/KAFKA/Performance+testing
