Why does my kafka tmp folder have almost the same size as the disk?


Problem description

I develop a production Kafka environment with this formation: 3 ZooKeeper servers, 3 Kafka brokers, and 2 Kafka Connect workers. I put my tmp folder side by side with my Kafka main folder, and I run everything in a remote Ubuntu environment, not in Docker.

While operating Kafka, I ran into an error telling me that too much disk was being consumed. I checked my Kafka tmp folder and found its size was almost 2/3 of the disk, which took down my Kafka cluster.
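To see where the space was going, something like this works (a minimal sketch; the path matches log.dir in my broker config below, and the flags assume GNU coreutils):

# list every partition directory under the broker's log dir, largest first
du -sh /home/xxx/tmp/kafka_log1/* | sort -rh | head -20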

I inspected each Kafka log folder and found this (a way to pull the same sizes straight from the brokers is sketched after the list):

  1. 25 connect_offset partitions from worker no. 1, ~21 MB each
  2. 25 connect_offset2 partitions from worker no. 2, ~21 MB each
  3. 25 connect_status partitions from worker no. 1, ~21 MB each
  4. 25 connect_status2 partitions from worker no. 2, ~21 MB each
  5. 50 __consumer_offset partitions from both workers, ~21 MB each
  6. topic offsets, ~21 MB each per topic; I have 2 topics, so I have 6 topic offsets
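The same numbers can be read from the brokers themselves with the kafka-log-dirs tool that ships with Kafka (a sketch; the broker address is a placeholder):

# report the size of every partition directory for the given topic
kafka-log-dirs.sh --bootstrap-server XXX:9099 --describe --topic-list __consumer_offsets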

The problem is that __consumer_offset consumes more disk than the other offsets, and my Kafka config cannot handle it. This is my Kafka configuration:

broker.id=101
port=9099
listeners=PLAINTEXT://0.0.0.0:9099
advertised.listeners=PLAINTEXT://127.0.0.1:9099
num.partitions=3
offsets.topic.replication.factor=3
log.dir=/home/xxx/tmp/kafka_log1
log.cleaner.enable=true
log.cleanup.policy=delete
log.retention.bytes=1073741824
log.segment.bytes=1073741824
log.retention.check.interval.ms=60000
message.max.bytes=1073741824
zookeeper.connect=xxx:2185,xxx:2186,xxx:2187
zookeeper.connection.timeout.ms=7200000
zookeeper.session.timeout.ms=30000
delete.topic.enable=true

And for each topic, this is the config:

kafka-topics.sh --create --zookeeper xxx:2185,xxx:2186,xxx:2187 --replication-factor 3 --partitions 3 --topic $topic_name --config cleanup.policy=delete --config retention.ms=86400000 --config min.insync.replicas=2 --config compression.type=gzip
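For topics that already exist, retention can be tightened afterwards with kafka-configs (a sketch; the ZooKeeper-based form matches the tooling used above, and the retention value is illustrative):

# cap each partition of an existing topic at 512 MiB
kafka-configs.sh --zookeeper xxx:2185,xxx:2186,xxx:2187 --alter --entity-type topics --entity-name $topic_name --add-config retention.bytes=536870912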

And the Connect config looks like this (both Connect configs are identical except for the port and the offset and status topic settings):

bootstrap.servers=XXX:9099,XXX:9098,XXX:9097
group.id=XXX
key.converter.schemas.enable=true
value.converter.schemas.enable=true
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
offset.storage.topic=connect-offsets
offset.storage.replication.factor=3
config.storage.topic=connect-configs
config.storage.replication.factor=3
status.storage.topic=connect-status
status.storage.replication.factor=3
offset.flush.timeout.ms=300000
rest.host.name=xxx
rest.port=8090
connector.client.config.override.policy=All
producer.max.request.size=1073741824
producer.acks=all
producer.enable.idempotence=true
consumer.max.partition.fetch.bytes=1073741824
consumer.auto.offset.reset=latest
consumer.enable.auto.commit=true
consumer.max.poll.interval.ms=5000000
plugin.path=/xxx/connectors

According to several pieces of documentation, Kafka very obviously doesn't need large disk space (the largest recorded tmp folder I've seen is 36 GB).

Answer

"@ 21 MB"是什么意思?您的log.segment.bytes设置为1GB ...

What do you mean "@ 21 MB"? Your log.segment.bytes is set at 1GB...
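Two details worth knowing here: log.retention.bytes is applied per partition, and only closed segments are ever deleted, so with the segment size equal to the retention size a partition carries a full active segment on top of what retention keeps. A hedged variant (the values are illustrative, not a recommendation):

# smaller segments roll and expire sooner; only closed segments can be deleted
# 256 MiB per segment
log.segment.bytes=268435456
# keep at most ~1 GiB per partition
log.retention.bytes=1073741824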

First, never use /tmp for persistent storage. And don't use /home for server data. Always use a separate partition/disk for server data, as well as for /var + /var/logs.
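A minimal sketch of what that looks like in server.properties, assuming a dedicated disk mounted at /var/lib/kafka (the mount point is an assumption, not something from the post):

# point the broker at the dedicated data disk instead of /home or /tmp
log.dirs=/var/lib/kafka/data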

Second, you have 2 Connect clusters. Use the same 3 topics and the same group.id; then you have 1 distributed cluster, and you save yourself 3 extra topics.
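A sketch of the worker properties that merge the two installations into one distributed cluster (the group.id value is hypothetical; the topic names mirror the config above):

# identical on every worker: one group.id and one set of storage topics
group.id=connect-cluster
offset.storage.topic=connect-offsets
config.storage.topic=connect-configs
status.storage.topic=connect-status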

Finally, regarding

"__consumer_offset consumes more disk than the other offsets"

Well, yes. All consumer groups store their offsets there. This will be by far the largest internal topic, depending on your offsets.retention.minutes.
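If old offsets are what is filling the disk, the broker-side retention window can be shortened (a sketch; 1440 minutes = 1 day is an illustrative value, and the default on recent Kafka versions is 7 days):

# how long committed consumer offsets are kept once a group goes inactive
offsets.retention.minutes=1440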

"Kafka doesn't need large disk space"

It doesn't when you are getting started.

I've seen clusters with tens to hundreds of TB of storage.

If you watch Kafka Summit talks from large companies, they are sending GBs of events per second (e.g. Netflix, Spotify, Uber, etc.):

  1. Apache
  2. Confluent
