Kafka 的 session.timeout.ms 和 max.poll.interval.ms 之间的差异 >= 0.10.1 [英] Difference between session.timeout.ms and max.poll.interval.ms for Kafka >= 0.10.1

查看:74
本文介绍了Kafka 的 session.timeout.ms 和 max.poll.interval.ms 之间的差异 >= 0.10.1的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我不清楚为什么我们需要 session.timeout.msmax.poll.interval.ms 以及我们什么时候使用其中一个或两者?似乎这两个设置都表明协调器在假设消费者死亡之前等待从消费者那里获得心跳的时间上限.

还有基于 KIP-62?

解决方案

在 KIP-62 之前,只有 session.timeout.ms(即 Kafka 0.10.0> 和更早).max.poll.interval.ms 是通过 KIP-62(Kafka 0.10.1 的一部分).

KIP-62,通过后台心跳线程将心跳与对poll() 的调用分离,从而允许更长的处理时间(即两个连续poll() 之间的时间)>) 比心跳间隔.

假设处理一条消息需要 1 分钟.如果heartbeat 和poll 耦合(即KIP-62 之前),则需要将session.timeout.ms 设置为大于1 分钟以防止消费者超时.但是,如果消费者死亡,则检测失败消费者也需要 1 分钟以上的时间.

KIP-62 将轮询和心跳分离,允许在两次连续轮询之间发送心跳.现在您有两个线程在运行,心跳线程和处理线程,因此,KIP-62 为每个线程引入了超时.session.timeout.ms 用于心跳线程,max.poll.interval.ms 用于处理线程.

假设您设置了 session.timeout.ms=30000,因此,消费者心跳线程必须在此时间到期之前向代理发送心跳.另一方面,如果处理单个消息需要 1 分钟,您可以将 max.poll.interval.ms 设置为大于一分钟,以便给处理线程更多的时间来处理消息.>

如果处理线程终止,则需要 max.poll.interval.ms 来检测到这一点.然而,如果整个消费者死亡(并且一个死亡的处理线程很可能使包括心跳线程在内的整个消费者崩溃),它只需要 session.timeout.ms 来检测它.

这个想法是,即使处理本身需要很长时间,也能快速检测到失败的消费者.

实现细节

新的超时max.poll.interval.ms主要是一个客户端的概念:如果poll()max.poll.interval内没有被调用.ms,心跳线程会检测到这种情况并向代理发送离开组请求.-- max.poll.intervalms 仍然与消费者组重新平衡相关:如果触发重新平衡,消费者有 max.poll.interval.ms 时间重新加入通过调用触发加入组请求的 poll() 客户端进行分组.

I am unclear why we need both session.timeout.ms and max.poll.interval.ms and when would we use one or the other or both? It seems like both settings indicate the upper bound on the time the coordinator will wait to get the heartbeat from a consumer before assuming it's dead.

Also how does it behave for versions 0.10.1.0+ based on KIP-62?

解决方案

Before KIP-62, there is only session.timeout.ms (ie, Kafka 0.10.0 and earlier). max.poll.interval.ms is introduced via KIP-62 (part of Kafka 0.10.1).

KIP-62, decouples heartbeats from calls to poll() via a background heartbeat thread, allowing for a longer processing time (ie, time between two consecutive poll()) than heartbeat interval.

Assume processing a message takes 1 minute. If heartbeat and poll are coupled (ie, before KIP-62), you will need to set session.timeout.ms larger than 1 minute to prevent consumer to time out. However, if a consumer dies, it also takes longer than 1 minute to detect the failed consumer.

KIP-62 decouples polling and heartbeat allowing to send heartbeats between two consecutive polls. Now you have two threads running, the heartbeat thread and the processing thread and thus, KIP-62 introduced a timeout for each. session.timeout.ms is for the heartbeat thread while max.poll.interval.ms is for the processing thread.

Assume, you set session.timeout.ms=30000, thus, the consumer heartbeat thread must sent a heartbeat to the broker before this time expires. On the other hand, if processing of a single message takes 1 minutes, you can set max.poll.interval.ms larger than one minute to give the processing thread more time to process a message.

If the processing thread dies, it takes max.poll.interval.ms to detect this. However, if the whole consumer dies (and a dying processing thread most likely crashes the whole consumer including the heartbeat thread), it takes only session.timeout.ms to detect it.

The idea is, to allow for a quick detection of a failing consumer even if processing itself takes quite long.

Implemenation Detail

The new timeout max.poll.interval.ms is mainly a client side concept: if poll() is not called within max.poll.interval.ms, the heartbeat thread will detect this case and send a leave-group request to the broker. -- max.poll.intervalms is still relevant for consumer group rebalances: if a rebalance is triggered, consumers have max.poll.interval.ms time to re-join the group by calling poll() client side which triggers a join-group request.

这篇关于Kafka 的 session.timeout.ms 和 max.poll.interval.ms 之间的差异 >= 0.10.1的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆