kafka ack=all 和 min-isr [英] kafka ack=all and min-isr

查看:187
本文介绍了kafka ack=all 和 min-isr的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

总结

Kafka 的文档和代码注释表明,当生产者设置 acks 设置为 all 时,只有在 all 时才会向生产者发送 ack同步副本已经赶上,但代码(Partition.ScalacheckEnoughReplicasReachOffset)似乎表明只要 就发送 ack最小同步副本已赶上.

The docs and code comments for Kafka suggest that when the producer setting acks is set to all then an ack will only be sent to the producer when all in-sync replicas have caught up, but the code (Partition.Scala, checkEnoughReplicasReachOffset) seems to suggest that the ack is sent as soon as min in-sync replicas have caught up.

详情

kafka 文档有这个:

The kafka docs have this:

acks=all 这意味着领导者将等待完整的同步副本集来确认记录.来源

acks=all This means the leader will wait for the full set of in-sync replicas to acknowledge the record. source

另外,查看 Kafka 源代码 - partition.scala checkEnoughReplicasReachOffset() 有以下注释(重点是我的):

Also, looking at the Kafka source code - partition.scala checkEnoughReplicasReachOffset() has the following comment (emphasis mine):

请注意,只有在 requiredAcks = -1 时才会调用此方法,并且我们正在等待 ISR 中的所有副本完全赶上与此生产请求对应的(本地)领导者的偏移量之前我们确认产品请求.

Note that this method will only be called if requiredAcks = -1 and we are waiting for all replicas in ISR to be fully caught up to the (local) leader's offset corresponding to this produce request before we acknowledge the produce request.

最后,这个答案关于堆栈溢出(再次强调我的)

Finally, this answer on Stack Overflow (emphasis mine again)

此外,最小同步副本设置还指定了需要同步以使分区保持可用于写入的最小副本数.当生产者指定 ack (-1/all config) 时,它仍然会等待来自所有同步副本的 ack 在那一刻(独立于最小同步副本的设置).

Also the min in-sync replica setting specifies the minimum number of replicas that need to be in-sync for the partition to remain available for writes. When a producer specifies ack (-1 / all config) it will still wait for acks from all in sync replicas at that moment (independent of the setting for min in-sync replicas).

但是当我查看 Partition.Scala 中的代码时(注意 minIsr ):

But when I look at the code in Partition.Scala (note minIsr < curInSyncReplicas.size):

def checkEnoughReplicasReachOffset(requiredOffset: Long): (Boolean, Errors) = {
  ...
  val minIsr = leaderReplica.log.get.config.minInSyncReplicas
  if (leaderReplica.highWatermark.messageOffset >= requiredOffset) {          
    if (minIsr <= curInSyncReplicas.size)
      (true, Errors.NONE)

调用这个的代码返回ack:

The code that calls this returns the ack:

if (error != Errors.NONE || hasEnough) {
  status.acksPending = false
  status.responseStatus.error = error
}

因此,一旦同步副本集大于最小同步副本,代码看起来就会返回 ack.但是,文档和评论表明只有在所有同步副本都赶上后才会发送 ack.我错过了什么?至少,checkEnoughReplicasReachOffset 上面的注释看起来应该改变.

So, the code looks like it returns an ack as soon as the in-sync replica set are greater than min in-sync replicas. However, the documentation and comments suggest that the ack is only sent once all in-sync replicas have caught up. What am I missing? At the very least, the comment above checkEnoughReplicasReachOffset looks like it should be changed.

推荐答案

感谢 jira-dev 邮件列表中的 Ismael.

Thanks to Ismael on the jira-dev mailing list.

重点是线:

if(leaderReplica.highWatermark.messageOffset >= requiredOffset) {

仅当 ISR 中的所有副本都具有特定偏移量时,高水印才会移动.

The high watermark only moves when all the replicas in ISR have that particular offset.

这篇关于kafka ack=all 和 min-isr的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆