Kafka __consumer_offsets growing in size


Problem Description

We are using Kafka as a strictly ordered queue, and hence a single topic / single partition / single consumer group combination is in use. I should be able to use multiple partitions later in the future.

My consumer is a Spring Boot app listener that produces to and consumes from the same topic(s). So the consumer group is fixed and there is always a single consumer.

Kafka version 0.10.1.1

In this scenario the log files for topic-0 and a few __consumer_offsets_XX partitions grow. In fact __consumer_offsets_XX grows very large, even though it is supposed to be cleaned up periodically every 60 minutes (by default). The consumer doesn't read all the time, but it has enable.auto.commit=true.
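
For context, here is a minimal sketch of consumer settings matching this setup; the property names are standard Kafka consumer configuration, but the values and group name are illustrative assumptions, not taken from the original post. With auto-commit enabled, the consumer appends a fresh record to __consumer_offsets every auto.commit.interval.ms, which is what makes that topic keep growing:

    # Hypothetical consumer configuration for the setup described above
    bootstrap.servers=localhost:9092
    group.id=my-fixed-group           # the single, fixed consumer group
    enable.auto.commit=true           # commit offsets automatically...
    auto.commit.interval.ms=5000      # ...every 5 s; each commit appends a record to __consumer_offsets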

By default, log.retention.hours (default: 7 days) is greater than offsets.retention.minutes (default: 1 day); but in my case, since my consumer group/consumer is fixed and single, it may not make any sense to keep messages in topic-0 once they have been consumed. Should I set the log retention as low as, say, 3 days?
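
If the goal is only to shrink topic-0 faster, retention can also be overridden per topic instead of broker-wide. A sketch using the 0.10.x tooling, where the topic name and ZooKeeper address are assumptions:

    # Keep messages on the data topic for 3 days only (259200000 ms = 3 days)
    bin/kafka-configs.sh --zookeeper localhost:2181 \
      --entity-type topics --entity-name my-topic \
      --alter --add-config retention.ms=259200000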

Can I lower offsets.retention.minutes to control the growing size of __consumer_offsets_XX without touching the auto-commit settings?

Recommended Answer

The offsets.retention.minutes and log.retention.XXX properties will result in physical removal of records/messages/logs only if offset file rolling occurs.

In general, the offsets.retention.minutes property dictates that the broker should forget about your consumer if it has disappeared for the specified amount of time, and the broker can do that even without removing log files from the disk.

If you set this value to a relatively low number and check your __consumer_offsets topic while there are no active consumers, over time you will notice something like:

    [group,topic,7]::OffsetAndMetadata(offset=7, leaderEpoch=Optional.empty, metadata=, commitTimestamp=1557475923142, expireTimestamp=None)
    [group,topic,8]::OffsetAndMetadata(offset=6, leaderEpoch=Optional.empty, metadata=, commitTimestamp=1557475923142, expireTimestamp=None)
    [group,topic,6]::OffsetAndMetadata(offset=7, leaderEpoch=Optional.empty, metadata=, commitTimestamp=1557475923142, expireTimestamp=None)
    [group,topic,19]::NULL
    [group,topic,5]::NULL
    [group,topic,22]::NULL
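
For reference, output like the above can be obtained by reading __consumer_offsets with the console consumer plus the offsets message formatter. A sketch assuming a recent broker; on 0.10.x the formatter class is kafka.coordinator.GroupMetadataManager$OffsetsMessageFormatter instead, and the bootstrap address is an assumption:

    # Dump __consumer_offsets in human-readable form; consumers skip
    # internal topics by default, so that must be disabled explicitly.
    bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
      --topic __consumer_offsets --from-beginning \
      --consumer-property exclude.internal.topics=false \
      --formatter "kafka.coordinator.group.GroupMetadataManager\$OffsetsMessageFormatter"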

This reflects how event-store systems such as Kafka work in general: they record new events instead of changing existing ones.

I am not aware of any Kafka version where topics are deleted/cleaned up every 60 minutes by default, and I have a feeling you misinterpreted something in the documentation.

It seems that __consumer_offsets is managed very differently from regular topics. The only way to get __consumer_offsets records deleted is to force rolling of its segment files. That, however, doesn't happen the same way it does for regular log files. While regular log files (for your data topics) are rolled automatically whenever deletion happens, regardless of the log.roll.XXX properties, __consumer_offsets segments don't do that. And if they are never rolled and stay on the initial ...00000 segment, they are not deleted at all. So the way to reduce your __consumer_offsets files seems to be (a configuration sketch follows the list):

  1. Set a relatively small log.roll.XXX value;
  2. Manipulate offsets.retention.minutes if you can afford to disconnect your consumers;
  3. Otherwise adjust the log.retention.XXX properties.
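
As a concrete illustration, broker-side settings along these lines would implement the list above. This is a sketch only; the values are assumptions for illustration, not recommendations from the original answer:

    # server.properties (sketch; values are illustrative assumptions)
    log.roll.hours=24                # force segments to roll daily, so __consumer_offsets can roll too
    offsets.retention.minutes=1440   # forget offsets of consumers gone for more than 1 day
    log.retention.hours=72           # keep data-topic messages for 3 days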
