Kafka __consumer_offsets growing in size
Question
We are using Kafka as a strictly ordered queue, hence a single topic / single partition / single consumer group combo is in use. I should be able to use multiple partitions later in the future.
My consumer is a spring-boot app listener that produces to and consumes from the same topic(s). So the consumer group is fixed and there is always a single consumer.
Kafka version 0.10.1.1
In such a scenario, the log files for topic-0 and a few __consumer_offsets_XX partitions grow. In fact, __consumer_offsets_XX grows very large, even though it is supposed to be cleaned up periodically every 60 minutes (by default). The consumer doesn't read all the time, but it has enable.auto.commit=true.
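A consumer configuration matching this description might look like the sketch below; the group name and values are illustrative assumptions, not taken from the question:

```properties
# Illustrative consumer settings; group.id and the interval are assumptions
bootstrap.servers=localhost:9092
group.id=ordered-queue-consumer
enable.auto.commit=true
# Default commit interval; every commit appends a record to __consumer_offsets,
# which is one reason that internal topic keeps growing while auto-commit is on
auto.commit.interval.ms=5000
```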
By default, log.retention.minutes (default 7 days) > offsets.retention.minutes (default 1 day); but in my case, since my consumer group/consumer is fixed and single, it may not make sense to keep the messages in topic-0 once they are consumed. Shall I make log.retention.minutes as low as, say, 3 days?
Can I lower offsets.retention.minutes to control the growing size of __consumer_offsets_XX without touching the auto.commit settings?
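The two retention settings being compared are broker-side properties (server.properties); the values below are the 0.10.x defaults the question refers to, shown for illustration:

```properties
# Broker settings (server.properties); values shown are the 0.10.x defaults
log.retention.hours=168          # data-topic retention, 7 days
offsets.retention.minutes=1440   # committed-offset retention, 1 day
```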
Answer
The offsets.retention.minutes and log.retention.XXX properties will result in physical removal of records/messages/logs only if offset file rolling occurs.
In general, the offsets.retention.minutes property dictates that a broker should forget about your consumer if it has disappeared for the specified amount of time, and the broker can do that even without removing log files from disk.
If you set this value to a relatively low number and check your __consumer_offsets
topic while there are no active consumers, over time you will notice something like:
[group,topic,7]::OffsetAndMetadata(offset=7, leaderEpoch=Optional.empty, metadata=, commitTimestamp=1557475923142, expireTimestamp=None)
[group,topic,8]::OffsetAndMetadata(offset=6, leaderEpoch=Optional.empty, metadata=, commitTimestamp=1557475923142, expireTimestamp=None)
[group,topic,6]::OffsetAndMetadata(offset=7, leaderEpoch=Optional.empty, metadata=, commitTimestamp=1557475923142, expireTimestamp=None)
[group,topic,19]::NULL
[group,topic,5]::NULL
[group,topic,22]::NULL
This illustrates how event-store systems like Kafka work in general: they record new events instead of changing the existing ones.
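A dump like the one above can typically be obtained with the console consumer and the offsets message formatter; the exact formatter class path varies by Kafka version, so treat this as a sketch rather than a version-exact command:

```
# Inspect committed offsets in the internal topic (requires a running broker).
# The formatter class shown is for Kafka 0.11+; older versions used
# kafka.coordinator.GroupMetadataManager\$OffsetsMessageFormatter instead.
kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic __consumer_offsets --from-beginning \
  --formatter "kafka.coordinator.group.GroupMetadataManager\$OffsetsMessageFormatter"
```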
I am not aware of any Kafka version where topics are deleted/cleaned up every 60 minutes by default and I have a feeling you misinterpreted something from the documentation.
It seems that the way __consumer_offsets is managed is very different from regular topics. The only way to get __consumer_offsets segments deleted is to force rolling of its files. That, however, doesn't happen the same way it does for regular log files. While regular log files (for your data topics) are rolled automatically whenever deletion occurs, regardless of the log.roll. property, __consumer_offsets doesn't do that. If its files are never rolled and stay in the initial ...00000 segment, they are not deleted at all. So, it seems the way to reduce your __consumer_offsets files is:
- Set a relatively small log.roll. value;
- Manipulate offsets.retention.minutes if you can afford to disconnect your consumers;
- Otherwise adjust the log.retention.XXX properties.
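The three options above correspond roughly to the following broker properties. The values are illustrative, and log.roll.hours is named here as one member of the log.roll. family the answer leaves truncated:

```properties
# Illustrative broker settings for keeping __consumer_offsets in check
log.roll.hours=24                # force segment rolling so old segments can be cleaned
offsets.retention.minutes=60     # only safe if consumers can be disconnected briefly
log.retention.hours=72           # otherwise: shorter data-topic retention (3 days)
```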