kafka ack=all 和 min-isr [英] kafka ack=all and min-isr
问题描述
总结
Kafka 的文档和代码注释表明,当生产者设置 acks
设置为 all
时,只有在 all 时才会向生产者发送 ack同步副本已经赶上,但代码(Partition.Scala
、checkEnoughReplicasReachOffset
)似乎表明只要 就发送 ack最小同步副本已赶上.
The docs and code comments for Kafka suggest that when the producer setting acks
is set to all
then an ack will only be sent to the producer when all in-sync replicas have caught up, but the code (Partition.Scala
, checkEnoughReplicasReachOffset
) seems to suggest that the ack is sent as soon as min in-sync replicas have caught up.
详情
kafka 文档有这个:
The kafka docs have this:
acks=all 这意味着领导者将等待完整的同步副本集来确认记录.来源
acks=all This means the leader will wait for the full set of in-sync replicas to acknowledge the record. source
另外,查看 Kafka 源代码 - partition.scala
checkEnoughReplicasReachOffset()
有以下注释(重点是我的):
Also, looking at the Kafka source code - partition.scala
checkEnoughReplicasReachOffset()
has the following comment (emphasis mine):
请注意,只有在 requiredAcks = -1 时才会调用此方法,并且我们正在等待 ISR 中的所有副本完全赶上与此生产请求对应的(本地)领导者的偏移量之前我们确认产品请求.
Note that this method will only be called if requiredAcks = -1 and we are waiting for all replicas in ISR to be fully caught up to the (local) leader's offset corresponding to this produce request before we acknowledge the produce request.
最后,这个答案关于堆栈溢出(再次强调我的)
Finally, this answer on Stack Overflow (emphasis mine again)
此外,最小同步副本设置还指定了需要同步以使分区保持可用于写入的最小副本数.当生产者指定 ack (-1/all config) 时,它仍然会等待来自所有同步副本的 ack 在那一刻(独立于最小同步副本的设置).
Also the min in-sync replica setting specifies the minimum number of replicas that need to be in-sync for the partition to remain available for writes. When a producer specifies ack (-1 / all config) it will still wait for acks from all in sync replicas at that moment (independent of the setting for min in-sync replicas).
但是当我查看 Partition.Scala 中的代码时(注意 minIsr
But when I look at the code in Partition.Scala (note minIsr < curInSyncReplicas.size
):
def checkEnoughReplicasReachOffset(requiredOffset: Long): (Boolean, Errors) = {
...
val minIsr = leaderReplica.log.get.config.minInSyncReplicas
if (leaderReplica.highWatermark.messageOffset >= requiredOffset) {
if (minIsr <= curInSyncReplicas.size)
(true, Errors.NONE)
调用这个的代码返回ack:
The code that calls this returns the ack:
if (error != Errors.NONE || hasEnough) {
status.acksPending = false
status.responseStatus.error = error
}
因此,一旦同步副本集大于最小同步副本,代码看起来就会返回 ack.但是,文档和评论表明只有在所有同步副本都赶上后才会发送 ack.我错过了什么?至少,checkEnoughReplicasReachOffset
上面的注释看起来应该改变.
So, the code looks like it returns an ack as soon as the in-sync replica set are greater than min in-sync replicas. However, the documentation and comments suggest that the ack is only sent once all in-sync replicas have caught up. What am I missing? At the very least, the comment above checkEnoughReplicasReachOffset
looks like it should be changed.
推荐答案
感谢 jira-dev 邮件列表中的 Ismael.
Thanks to Ismael on the jira-dev mailing list.
重点是线:
if(leaderReplica.highWatermark.messageOffset >= requiredOffset) {
仅当 ISR 中的所有副本都具有特定偏移量时,高水印才会移动.
The high watermark only moves when all the replicas in ISR have that particular offset.
这篇关于kafka ack=all 和 min-isr的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!