Kafka consumer group rebalancing

Problem description

I'm using Kafka consumer group management to process my messages.

Processing time varies from message to message, so I have set the max poll interval (max.poll.interval.ms) to 20 minutes and max records (max.poll.records) to 20. I'm using 5 partitions and 5 consumer instances, with default values for every setting other than those two.

But I still get the following error intermittently:

[Consumer clientId=consumer-3, groupId=amc_dashboard_analytics] Attempt to heartbeat failed since group is rebalancing

My understanding from the consumer configuration docs is that a rebalance only occurs if poll is not called before the max poll interval elapses. For me, however, rebalancing happens before the 20 minutes have passed.

Also, after a few hours of running, all the assigned consumers leave the group, logging "Attempt to heartbeat failed since group is rebalancing", and do not rejoin (ideally they should).

Am I missing something here? Any leads would be helpful.

Solution

Another cause of rebalancing is session.timeout.ms expiring without the broker receiving a heartbeat. You can consider increasing this consumer setting.

From Kafka docs:

heartbeat.interval.ms: The expected time between heartbeats to the consumer coordinator when using Kafka's group management facilities. Heartbeats are used to ensure that the consumer's session stays active and to facilitate rebalancing when new consumers join or leave the group. The value must be set lower than session.timeout.ms, but typically should be set no higher than 1/3 of that value. It can be adjusted even lower to control the expected time for normal rebalances. (default: 3000)


session.timeout.ms: The timeout used to detect client failures when using Kafka's group management facility. The client sends periodic heartbeats to indicate its liveness to the broker. If no heartbeats are received by the broker before the expiration of this session timeout, then the broker will remove this client from the group and initiate a rebalance. Note that the value must be in the allowable range as configured in the broker configuration by group.min.session.timeout.ms and group.max.session.timeout.ms. (default: 10000)
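As a rough illustration of the two settings quoted above, the sketch below builds consumer properties with a raised session timeout and a heartbeat interval at one third of it. The 30000 ms value, the class and method names, and the localhost:9092 address are assumptions for the example; the group id and the poll settings come from the question. Other required consumer settings (such as deserializers) are omitted, and whatever session timeout you choose still has to fall inside the broker's group.min.session.timeout.ms and group.max.session.timeout.ms range.

```java
import java.util.Properties;

public class ConsumerTimeoutConfig {

    // Builds consumer properties for a given session timeout (hypothetical helper).
    public static Properties build(int sessionTimeoutMs) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        props.put("group.id", "amc_dashboard_analytics");   // group id from the log line
        props.put("max.poll.interval.ms", String.valueOf(20 * 60 * 1000)); // 20 minutes, as in the question
        props.put("max.poll.records", "20");                // as in the question
        props.put("session.timeout.ms", String.valueOf(sessionTimeoutMs));
        // Keep the heartbeat interval at one third of the session timeout.
        props.put("heartbeat.interval.ms", String.valueOf(sessionTimeoutMs / 3));
        return props;
    }

    // Checks the documented guideline: heartbeat below the session timeout
    // and no higher than one third of it.
    public static boolean heartbeatWithinGuideline(Properties props) {
        long session = Long.parseLong(props.getProperty("session.timeout.ms"));
        long heartbeat = Long.parseLong(props.getProperty("heartbeat.interval.ms"));
        return heartbeat < session && heartbeat * 3 <= session;
    }

    public static void main(String[] args) {
        Properties props = build(30_000); // assumed value, tune for your environment
        System.out.println("session.timeout.ms=" + props.getProperty("session.timeout.ms"));
        System.out.println("heartbeat.interval.ms=" + props.getProperty("heartbeat.interval.ms"));
        System.out.println("within guideline: " + heartbeatWithinGuideline(props));
    }
}
```

These properties can be passed to a KafkaConsumer constructor once the remaining required settings are added.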

You can check this link for more information.

Even though heartbeats are sent at fixed intervals from a separate thread, in some cases a heartbeat cannot reach the broker within session.timeout.ms. Possible reasons include:

  • Network problems
  • Stop-the-world garbage collection on the consumer or broker side
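One way to check whether long pauses on the consumer side are plausible is a simple stall detector: sleep for a short, fixed time and measure how much longer the sleep actually took. This is a generic diagnostic idea rather than anything from the Kafka documentation, and the class name, thresholds, and loop length below are assumptions for the sketch. GC logs remain the more reliable source for pause durations.

```java
public class PauseMonitor {

    // How far a sleep overshot its requested duration, in milliseconds (never negative).
    public static long overshootMillis(long requestedMs, long elapsedNanos) {
        return Math.max(0, elapsedNanos / 1_000_000 - requestedMs);
    }

    // True when a single stall is at least as long as the session timeout,
    // meaning no heartbeat could have been sent in that window.
    public static boolean exceedsSession(long overshootMs, long sessionTimeoutMs) {
        return overshootMs >= sessionTimeoutMs;
    }

    public static void main(String[] args) throws InterruptedException {
        final long sleepMs = 100;             // assumed sampling interval
        final long reportThresholdMs = 1_000; // assumed: report stalls over one second
        final long sessionTimeoutMs = 10_000; // default quoted in the docs excerpt above

        // Bounded loop for the example; a real monitor would run in a daemon thread.
        for (int i = 0; i < 50; i++) {
            long start = System.nanoTime();
            Thread.sleep(sleepMs);
            long overshoot = overshootMillis(sleepMs, System.nanoTime() - start);
            if (overshoot > reportThresholdMs) {
                System.out.println("Stall of about " + overshoot + " ms"
                        + (exceedsSession(overshoot, sessionTimeoutMs)
                                ? " (longer than session.timeout.ms)" : ""));
            }
        }
    }
}
```

If stalls approaching the session timeout show up, that points toward the garbage collection explanation; if none appear, the network path to the broker is the more likely suspect.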
