Uneven Distribution of Messages in Kafka Partitions


Problem description

I have a topic with 10 partitions and one consumer group with 4 consumers; the worker size is 3.

I can see that messages are unevenly distributed across the partitions: one partition holds a lot of data while another is idle.

How can I make my producer distribute the load evenly across all partitions, so that every partition is properly utilized?

Recommended answer

According to the JavaDoc comment in the DefaultPartitioner class itself, the default partitioning strategy is:

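  • If a partition is specified in the record, use it.
  • If no partition is specified but a key is present, choose a partition based on a hash of the key.
  • If no partition or key is present, choose a partition in a round-robin fashion.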

https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/internals/DefaultPartitioner.java
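The three rules above can be sketched in a few lines. This is only an illustration of the decision order, not Kafka's implementation: the class name `SketchPartitioner` is hypothetical, and CRC32 stands in for the hash Kafka actually uses (murmur2), so the partition numbers it produces will not match a real cluster.

```python
import itertools
import zlib
from typing import Optional


class SketchPartitioner:
    """Hypothetical stand-in that mimics the documented decision order."""

    def __init__(self, num_partitions: int) -> None:
        self.num_partitions = num_partitions
        # Counter used only for records that carry neither partition nor key.
        self._counter = itertools.count()

    def partition(self, key: Optional[bytes] = None,
                  partition: Optional[int] = None) -> int:
        # Rule 1: an explicit partition always wins.
        if partition is not None:
            return partition
        # Rule 2: with a key, hash it (CRC32 here is an assumption;
        # Kafka itself uses murmur2).
        if key is not None:
            return zlib.crc32(key) % self.num_partitions
        # Rule 3: no partition and no key, so rotate through partitions.
        return next(self._counter) % self.num_partitions


if __name__ == "__main__":
    p = SketchPartitioner(10)
    print(p.partition(key=b"order-1", partition=7))  # explicit partition
    print(p.partition(key=b"order-1"))               # derived from the key
    print([p.partition() for _ in range(12)])        # round-robin
```

The point to take away is rule 2: the partition is a pure function of the key, so identical keys always land in the same partition.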

So there are two possible causes of the uneven distribution, depending on whether you specify a key when producing the message:

  • If you specify a key and get an uneven distribution with the DefaultPartitioner, the most likely explanation is that you are specifying the same key many times. Every record with the same key hashes to the same partition.

  • If you do not specify a key and use the DefaultPartitioner, a non-obvious behavior may be at work. Based on the above you would expect round-robin distribution of messages, but this is not necessarily the case. An optimization introduced in 0.8.0 can cause the same partition to be used repeatedly. See this link for a more detailed explanation: https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyisdatanotevenlydistributedamongpartitionswhenapartitioningkeyisnotspecified?
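To see how the first cause plays out, here is a small simulation of key-based partitioning over 10 partitions. It is a sketch under assumptions: CRC32 again stands in for Kafka's real key hash, and the `customer-N` keys and the 900/100 split are made-up illustration data, not taken from the question.

```python
import zlib
from collections import Counter
from typing import Iterable, List


def partition_for(key: bytes, num_partitions: int) -> int:
    # Assumed hash for illustration; Kafka's producer uses murmur2.
    return zlib.crc32(key) % num_partitions


def distribution(keys: Iterable[bytes], num_partitions: int = 10) -> List[int]:
    """Return the number of records that would land in each partition."""
    counts = Counter(partition_for(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


# One "hot" key accounts for 900 of 1000 records.
skewed = [b"customer-1"] * 900 + [f"customer-{i}".encode() for i in range(2, 102)]
# 1000 records, each with a distinct key.
spread = [f"customer-{i}".encode() for i in range(1000)]

if __name__ == "__main__":
    print("skewed keys:", distribution(skewed))
    print("spread keys:", distribution(spread))
```

With the hot key, at least 900 records pile into a single partition no matter how many partitions the topic has. If your keys look like this, a more evenly distributed key (or no key at all, where ordering per key is not required) is the usual way out.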
