Kafka deletes segments even before the segment size is reached

Question

I'm playing with Kafka on my local machine, and I have added the following topic configuration:

bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic topic1 --config retention.ms=60000
bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic topic1 --config file.delete.delay.ms=40000
bin/kafka-topics.sh --zookeeper localhost:2181 --alter --topic topic1 --config segment.bytes=400000

My understanding is that a segment will be deleted only when it has reached the segment size defined above (segment.bytes=400000) AND every single message within the segment is older than the retention time defined above (retention.ms=60000).

What I noticed is that a segment of just 35 bytes, which contained just one message, was deleted after one minute (maybe a little more).

Where did I get that information? From a post by a LinkedIn engineer about how the deletion process works:

Retention is going to be based on a combination of both the retention and segment size settings (as a side note, it's recommended to use log.retention.ms and log.segment.ms, not the hours config. That's there for legacy reasons, but the ms configs are more consistent). As messages are received by Kafka, they are written to the current open log segment for each partition. That segment is rotated when either the log.segment.bytes or the log.segment.ms limit is reached. Once that happens, the log segment is closed and a new one is opened. Only after a log segment is closed can it be deleted via the retention settings. Once the log segment is closed AND either all the messages in the segment are older than log.retention.ms OR the total partition size is greater than log.retention.bytes, then the log segment is purged.

Link: How retention works

Recommended answer

You are misinterpreting some of the statements you cite:

That segment is rotated when either the log.segment.bytes or the log.segment.ms limit is reached.

This clearly says that rotation can be triggered by size or by time. It's "or", not "and".
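
As a rough illustration of that "or", here is a toy model of the rotation rule as the quoted post describes it. It is not Kafka's actual code: the function name, its parameters, and the 60000 ms value used for segment.ms are made up for the example (the question never sets segment.ms).

```python
def should_roll(segment_size_bytes, segment_age_ms, segment_bytes, segment_ms):
    """Toy model of the quoted rule: the active segment is rotated
    when EITHER the size limit OR the time limit is reached."""
    size_limit_reached = segment_size_bytes >= segment_bytes
    time_limit_reached = segment_age_ms >= segment_ms
    return size_limit_reached or time_limit_reached


# A 35-byte segment is far below segment.bytes=400000, so size alone
# never triggers a rotation here.
print(should_roll(35, 1_000, segment_bytes=400_000, segment_ms=60_000))   # False

# Once the (hypothetical) time limit has passed, the same tiny segment is rotated.
print(should_roll(35, 60_000, segment_bytes=400_000, segment_ms=60_000))  # True
```

The point of the sketch is only that the size condition is not required: either condition on its own is enough to close the segment.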

Once that happens, the log segment is closed. [...] Once the log segment is closed AND either all the messages in the segment are older than log.retention.ms OR the total partition size is greater than log.retention.bytes, then the log segment is purged.

Thus, once a segment has been closed by a rotation that was triggered by time, it can be deleted regardless of its size.
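
The deletion condition from the quoted post can be sketched the same way. Again, this is a simplified model under stated assumptions and not Kafka's implementation: `can_delete` and its parameters are hypothetical names, and "no size limit" is modelled as `retention_bytes=None`.

```python
def can_delete(closed, newest_message_age_ms, partition_size_bytes,
               retention_ms, retention_bytes=None):
    """Toy model of the quoted rule: a segment is purged once it is closed
    AND (all its messages are older than retention.ms
         OR the partition is larger than retention.bytes)."""
    if not closed:
        return False
    # "All messages are older" is the same as "the newest message is older".
    expired_by_time = newest_message_age_ms > retention_ms
    over_size_limit = (retention_bytes is not None
                       and partition_size_bytes > retention_bytes)
    return expired_by_time or over_size_limit


# The 35-byte segment from the question, closed, with its only message
# a bit more than one minute old and retention.ms=60000:
print(can_delete(True, 61_000, 35, retention_ms=60_000))   # True

# Per the quoted description, the same segment is not eligible while still open.
print(can_delete(False, 61_000, 35, retention_ms=60_000))  # False
```

Note that the segment size does not appear in the condition at all, which matches what was observed: the tiny segment was removed shortly after the one-minute retention time had passed.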
