Kafka deletes segments even before the segment size is reached


Problem description

I'm experimenting with Kafka on my local machine, and I have applied the following topic configuration:

bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic topic1 --config retention.ms=60000
bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic topic1 --config file.delete.delay.ms=40000
bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic topic1 --config segment.bytes=400000

My understanding is that a segment is deleted only when it has reached the configured segment size (segment.bytes=400000) AND every single message in it is older than the configured retention time (retention.ms=60000).

What I observed instead is that a segment of just 35 bytes, containing a single message, was deleted after about a minute (maybe a little more).

Where did I get that information? From a post by a LinkedIn engineer about how the deletion process works:

Retention is going to be based on a combination of both the retention and segment size settings (as a side note, it's recommended to use log.retention.ms and log.segment.ms, not the hours config. That's there for legacy reasons, but the ms configs are more consistent). As messages are received by Kafka, they are written to the current open log segment for each partition. That segment is rotated when either the log.segment.bytes or the log.segment.ms limit is reached. Once that happens, the log segment is closed and a new one is opened. Only after a log segment is closed can it be deleted via the retention settings. Once the log segment is closed AND either all the messages in the segment are older than log.retention.ms OR the total partition size is greater than log.retention.bytes, then the log segment is purged.

Link: How retention works

Recommended answer

You are misinterpreting some of the statements you cite:

That segment is rotated when either the log.segment.bytes or the log.segment.ms limit is reached.

This clearly says that rotation can be triggered by size or by time. It's "or", not "and".
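The "or" in the quoted rule can be sketched as a small predicate. This is only a simplified model of the rule as the quoted post states it, not Kafka's actual implementation. The function name `should_roll` and the 60000 ms time limit used in the examples are assumptions made for illustration; the 400000-byte limit and the 35-byte segment come from the question.

```python
def should_roll(size_bytes: int, age_ms: int,
                segment_bytes: int, segment_ms: int) -> bool:
    """Simplified model: roll the open segment when EITHER limit is reached.

    size_bytes / age_ms describe the currently open segment;
    segment_bytes / segment_ms are the configured limits.
    """
    size_limit_reached = size_bytes >= segment_bytes
    time_limit_reached = age_ms >= segment_ms
    return size_limit_reached or time_limit_reached


# A 35-byte segment that is older than the (hypothetical) time limit is rolled,
# even though it is nowhere near the 400000-byte size limit.
tiny_but_old = should_roll(35, 70_000, segment_bytes=400_000, segment_ms=60_000)

# A full segment is rolled even if it is very young.
full_but_young = should_roll(400_000, 1_000, segment_bytes=400_000, segment_ms=60_000)

# A small, young segment stays open.
tiny_and_young = should_roll(35, 1_000, segment_bytes=400_000, segment_ms=60_000)
```

Under this model a segment never needs to reach segment.bytes to be closed; reaching the time limit alone is enough.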

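Once that happens, the log segment is closed [...] Once the log segment is closed AND either all the messages in the segment are older than log.retention.ms OR the total partition size is greater than log.retention.bytes, then the log segment is purged.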

Thus, after a segment has been closed by a time-triggered rotation, it can be deleted regardless of its size.
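The deletion condition from the quote can be modeled the same way. This is a hedged sketch of the stated rule only (closed AND (all messages expired OR partition too large)), not the broker's real code. The `Segment` class, the `can_delete` function, and the use of `None` for "no size limit" are hypothetical names and conventions introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    size_bytes: int
    newest_message_age_ms: int  # age of the most recent message in the segment
    closed: bool                # True once the segment has been rolled


def can_delete(seg: Segment, retention_ms: int,
               partition_bytes: int,
               retention_bytes: Optional[int] = None) -> bool:
    """Simplified model of the quoted rule; segment size plays no role."""
    if not seg.closed:
        return False
    # If the newest message is expired, every message in the segment is.
    all_expired = seg.newest_message_age_ms > retention_ms
    # None stands for "no size-based retention" in this sketch.
    partition_too_big = (retention_bytes is not None
                         and partition_bytes > retention_bytes)
    return all_expired or partition_too_big


# The case from the question: a closed 35-byte segment whose only message
# is a little over a minute old, with retention.ms=60000.
tiny_closed = Segment(size_bytes=35, newest_message_age_ms=61_000, closed=True)
deletable = can_delete(tiny_closed, retention_ms=60_000, partition_bytes=35)

# The same segment while it is still open is not eligible.
tiny_open = Segment(size_bytes=35, newest_message_age_ms=61_000, closed=False)
still_kept = can_delete(tiny_open, retention_ms=60_000, partition_bytes=35)
```

Note that `size_bytes` never appears in the decision: once a segment is closed, only message age and total partition size matter in this model.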
