Kafka - Retention Period Parameter


Problem Description

I'm trying to understand the logic behind the retention period in Apache Kafka. Please help me understand what happens in the scenarios below.

  1. What happens if the retention period is set to 0? Will all the records be deleted?
  2. If we remove the retention parameter itself, will it take the default value?

Recommended Answer

  1. Kafka does not allow you to set the retention period (in hours) to zero; it must be at least 1. If you set it to zero, you will get the following error and the broker will fail to start.

java.lang.IllegalArgumentException: requirement failed: log.retention.ms must be unlimited (-1) or, equal or greater than 1

You can still set it to zero while using the parameters log.retention.minutes or log.retention.ms.
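As a hedged sketch of how these broker-level settings relate (the values shown are illustrative, not recommendations), a server.properties fragment:

```properties
# Broker-level retention time settings (server.properties).
# If more than one is set, log.retention.ms takes precedence over
# log.retention.minutes, which takes precedence over log.retention.hours.
# log.retention.hours must be at least 1 (or -1 for unlimited); the
# finer-grained properties let you express much shorter windows.
log.retention.hours=168
# log.retention.minutes=10
# log.retention.ms=600000
```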

  • Now, let's come to the point of data deletion. In this situation, the old data won't necessarily be deleted even after the set retention (say 1 hour, or 1 minute) has expired, because another variable in server.properties, called log.segment.bytes, plays a major role there. The value of log.segment.bytes is set to 1 GB by default. Kafka only performs deletion on closed segments: a log segment is closed only once it reaches 1 GB, and only after that does retention kick in. So you need to reduce log.segment.bytes to an approximate value that is at most the cumulative ingestion volume of the data you plan to retain for that short duration. For example, if your retention period is 10 minutes and you get roughly 1 MB of data per minute, you can set log.segment.bytes=10485760, which is 1024 x 1024 x 10. You can find an example of how retention depends on both data ingestion and time in this thread.
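As a quick sanity check of the sizing arithmetic above, a minimal sketch (the 10-minute window and 1 MB/min rate are the answer's hypothetical numbers, not measured values):

```shell
# Rough sizing of log.segment.bytes for a short retention window.
# Assumptions: retention window of 10 minutes, ~1 MB ingested per minute.
RETENTION_MINUTES=10
MB_PER_MINUTE=1

# Segment size should be at most the data ingested during the window,
# so segments close (and become eligible for deletion) within it.
SEGMENT_BYTES=$((RETENTION_MINUTES * MB_PER_MINUTE * 1024 * 1024))
echo "$SEGMENT_BYTES"   # 10485760
```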

To test this, we can try a small experiment. Let's start ZooKeeper and Kafka, create a topic called test, and change its retention period to zero.

1) nohup ./zookeeper-server-start.sh ../config/zookeeper.properties &
2) nohup ./kafka-server-start.sh ../config/server.properties &
3) ./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
4) ./kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name test --alter --add-config log.retention.ms=0

  • Now, if we insert enough records using kafka-console-producer, we will see that the records are not deleted even after 2-3 minutes. But now, let's change segment.bytes to 100 bytes.

    5) ./kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name test --alter --add-config segment.bytes=100 
    

  • Now, almost immediately, we'll see that old records are getting deleted from Kafka.

  2. Yes. As it happens with every Kafka parameter in server.properties, if we delete or comment out a property, the default value for that property kicks in. The default retention period is 168 hours (1 week).
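For reference, a server.properties fragment illustrating the fallback: with the line commented out, the broker uses its built-in default of 168 hours.

```properties
# Commenting out (or deleting) the property reverts it to the broker's
# built-in default, which for retention is 168 hours (7 days):
#log.retention.hours=168
```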

