Kafka - Retention Period Parameter


Problem Description


Trying to understand the logic behind retention period in Apache Kafka. Please help me to understand the situation for the below scenarios.

  1. If retention period is set as 0, what will happen? Will all records be deleted?
  2. If we delete the retention parameter itself, will it take the default value?

Solution

  1. Kafka doesn't allow you to set the retention period to zero when it is given in hours (log.retention.hours). It must be at least 1. If you set it to zero, you'll get the following error message and the broker won't start.

java.lang.IllegalArgumentException: requirement failed: log.retention.ms must be unlimited (-1) or, equal or greater than 1

You can still set it to zero using the parameters log.retention.minutes or log.retention.ms.

  • Now, let's come to the point of data deletion. In this situation, the old data likely won't get deleted even after the set retention (say, 1 hour or 1 minute) has expired, because another variable in server.properties, log.segment.bytes, plays a major role here. The value of log.segment.bytes is 1 GB by default, and Kafka only performs deletion on closed segments. So a log segment is closed only once it reaches 1 GB, and only after that does retention kick in. You therefore need to reduce log.segment.bytes to roughly the cumulative ingestion volume of the data you plan to retain for that short duration. E.g., if your retention period is 10 minutes and you get roughly 1 MB of data per minute, you can set log.segment.bytes=10485760, which is 1024 x 1024 x 10. You can find an example of how retention depends on both data ingestion and time in this thread.
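The sizing arithmetic above can be sketched in shell (the 10-minute retention and ~1 MB/min ingest rate are the example figures from this answer, not recommendations):

```shell
# Rough sizing rule: segment size should be at most the data volume ingested
# during one retention window, so segments actually close (and become
# deletable) within that window.
RETENTION_MIN=10                          # example: 10-minute retention
INGEST_BYTES_PER_MIN=$((1024 * 1024))     # example: ~1 MB of data per minute
SEGMENT_BYTES=$((INGEST_BYTES_PER_MIN * RETENTION_MIN))
echo "log.segment.bytes=$SEGMENT_BYTES"   # 10485760 = 1024 x 1024 x 10
```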

  • To test this, we can try a small experiment. Let's start Zookeeper and Kafka, create a topic called test and change its retention period to zero.

    1) nohup ./zookeeper-server-start.sh ../config/zookeeper.properties &
    2) nohup ./kafka-server-start.sh ../config/server.properties &
    3) ./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
    4) ./kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name test --alter --add-config log.retention.ms=0
    

  • Now if we insert sufficient records using kafka-console-producer, we'll see that even after 2-3 minutes the records are not deleted. But now, let's change the segment size (the topic-level segment.bytes) to 100 bytes.

    5) ./kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name test --alter --add-config segment.bytes=100 
    

  • Now, almost immediately we'll see that old records are getting deleted from Kafka.
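The mechanism behind this can be illustrated with a toy simulation in plain shell (ordinary files, not Kafka itself): records append to an open segment; once the segment reaches segment_bytes it is closed, and with a retention of zero a closed segment is immediately eligible for deletion.

```shell
# Toy simulation of "retention only deletes closed segments".
segment_bytes=100
dir=$(mktemp -d)
open="$dir/open.log"
closed=0
for i in $(seq 1 25); do
  printf 'record-%02d\n' "$i" >> "$open"   # each record is exactly 10 bytes
  if [ "$(wc -c < "$open")" -ge "$segment_bytes" ]; then
    rm "$open"                             # segment closed -> retention 0 deletes it
    closed=$((closed + 1))
  fi
done
bytes=$(wc -c < "$open")
echo "segments closed and deleted: $closed"
echo "bytes still in the open segment: $((bytes))"
```

With 25 records of 10 bytes each and a 100-byte segment limit, two segments close and are deleted, while the last 50 bytes survive in the still-open segment, mirroring why shrinking segment.bytes made deletion visible almost immediately.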

  2. Yes. As with every Kafka parameter in server.properties, if we delete or comment out a property, its default value kicks in. I think the default retention period is 1 week (168 hours).
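To make that concrete, here is a minimal sketch of the relevant server.properties lines (168 hours, i.e. 7 days, is the stock broker default, so deleting or commenting out the line leaves behavior unchanged):

```properties
# Commented out: the broker falls back to its built-in default of 168 hours.
#log.retention.hours=168
# Default segment size; segments become eligible for deletion only once closed.
log.segment.bytes=1073741824
```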
