What are the different logs under the Kafka data log dir?

Question

I am trying to understand the Kafka data logs. I can see the logs under the directory set in log.dirs, named "topicname-partitionnumber". However, I would like to know what the different logs captured under it are. Below is the screenshot for a sample log.

Recommended answer

In Kafka, each partition has its own directory under the data directory configured by log.dirs (or log.dir). Each partition is split into segments.

A segment is just a collection of messages. Instead of writing all messages into a single file, Kafka splits them across multiple segments.

Whenever Kafka writes to a partition, it writes to the active segment. Each segment has a configured size limit (log.segment.bytes). When that limit is reached, Kafka closes the segment and opens a new one, which becomes the active segment. Depending on the configuration, a partition can have one or more segments.
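The rolling behavior described above can be sketched with a toy model. This is only an illustration, not Kafka's implementation: the `Partition` class and its `segment_bytes` parameter are hypothetical stand-ins, and real brokers can also roll segments on other conditions.

```python
class Partition:
    """Toy model of a partition that rolls segments by size (not Kafka's real code)."""

    def __init__(self, segment_bytes: int):
        self.segment_bytes = segment_bytes  # hypothetical stand-in for log.segment.bytes
        self.next_offset = 0
        self.segments = [{"base_offset": 0, "size": 0, "messages": []}]

    def append(self, payload: bytes) -> int:
        active = self.segments[-1]
        # Roll when the active segment is non-empty and the new message would not fit.
        if active["size"] > 0 and active["size"] + len(payload) > self.segment_bytes:
            active = {"base_offset": self.next_offset, "size": 0, "messages": []}
            self.segments.append(active)
        active["messages"].append(payload)
        active["size"] += len(payload)
        offset = self.next_offset
        self.next_offset += 1
        return offset


partition = Partition(segment_bytes=10)
for _ in range(4):
    partition.append(b"12345")  # 5 bytes each, so two messages fill one segment

base_offsets = [s["base_offset"] for s in partition.segments]
print(base_offsets)  # [0, 2]
```

Each new segment records the offset of its first message, which is the value Kafka uses to name the segment's files.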

Each segment consists of three files: a .log file, an .index file, and a .timeindex file.

A partition directory with a single segment therefore contains three files:

-rw-r--r-- 1 kafka hadoop  10485760 Dec  3 23:57 00000000000000000000.index
-rw-r--r-- 1 kafka hadoop 148814230 Oct 11 06:50 00000000000000000000.log
-rw-r--r-- 1 kafka hadoop  10485756 Dec  3 23:57 00000000000000000000.timeindex

The 00000000000000000000 at the start of the log and index file names is the name of the segment. It is the offset of the first record written to that segment, zero-padded to 20 digits. For example, if there are two segments, with segment 1 containing message offsets 0 and 1 and segment 2 containing message offsets 2 and 3, the directory looks like this:

-rw-r--r-- 1 kafka hadoop  10485760 Dec  3 23:57 00000000000000000000.index
-rw-r--r-- 1 kafka hadoop 148814230 Oct 11 06:50 00000000000000000000.log
-rw-r--r-- 1 kafka hadoop  10485756 Dec  3 23:57 00000000000000000000.timeindex
-rw-r--r-- 1 kafka hadoop  10485760 Dec  3 23:57 00000000000000000002.index
-rw-r--r-- 1 kafka hadoop 148814230 Oct 11 06:50 00000000000000000002.log
-rw-r--r-- 1 kafka hadoop  10485756 Dec  3 23:57 00000000000000000002.timeindex
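As a small sketch of this naming scheme, the file name can be derived from the base offset, and the segment holding a given offset can be found by binary search over the sorted base offsets. The helper names `segment_file_name` and `find_segment` are made up for illustration and are not part of any Kafka API.

```python
from bisect import bisect_right


def segment_file_name(base_offset: int, suffix: str = ".log") -> str:
    # Segment files are named after the base offset, zero-padded to 20 digits.
    return f"{base_offset:020d}{suffix}"


def find_segment(base_offsets: list[int], target_offset: int) -> int:
    """Return the base offset of the segment that would hold target_offset.

    base_offsets must be sorted in ascending order.
    """
    i = bisect_right(base_offsets, target_offset) - 1
    if i < 0:
        raise ValueError("target offset is before the first segment")
    return base_offsets[i]


print(segment_file_name(2))        # 00000000000000000002.log
print(find_segment([0, 2], 3))     # 2
```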

The .log file stores the messages themselves, together with each message's offset and timestamp. When reading from Kafka at a particular offset, finding that offset in a huge log file would be an expensive task. That is where the .index file is useful: it maps message offsets to the physical (byte) positions of those messages in the log file.
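To illustrate why the index makes reads cheaper, here is a minimal sketch of an offset lookup. It assumes a sparse index (only some offsets have an entry) held in memory as a sorted list of (offset, byte position) pairs; the data and the `lookup_position` helper are invented for the example and do not reflect the on-disk format.

```python
from bisect import bisect_right


def lookup_position(index: list[tuple[int, int]], target_offset: int) -> int:
    """Return the byte position from which to start scanning the .log file.

    index is a sorted list of (offset, byte_position) pairs. The result is the
    position of the last indexed offset that is <= target_offset, or 0 if the
    target precedes every index entry.
    """
    offsets = [offset for offset, _ in index]
    i = bisect_right(offsets, target_offset) - 1
    return index[i][1] if i >= 0 else 0


# Hypothetical index: one entry roughly every 50 messages.
index = [(0, 0), (50, 4096), (100, 8192)]

print(lookup_position(index, 75))   # 4096
print(lookup_position(index, 100))  # 8192
```

The reader then scans forward in the log file from the returned position until it reaches the requested offset, instead of scanning from the beginning of the segment.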

The .timeindex file is based on message timestamps: it maps timestamps to offsets, so that Kafka can look up messages by time.
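A time-based lookup can be sketched in two steps: use the time index to pick a starting offset, then scan forward for the first message at or after the target timestamp. This is a simplified toy, assuming timestamps never decrease within the segment; the sample data and the function name are hypothetical.

```python
from bisect import bisect_left


def first_offset_at_or_after(messages, time_index, target_ts):
    """Return the offset of the first message with timestamp >= target_ts, or None.

    messages:   list of (offset, timestamp) pairs in offset order
    time_index: sorted list of (timestamp, offset) pairs, sparser than messages
    """
    timestamps = [ts for ts, _ in time_index]
    # Last index entry strictly before the target gives a safe place to start.
    i = bisect_left(timestamps, target_ts) - 1
    start = time_index[i][1] if i >= 0 else 0
    for offset, ts in messages:
        if offset >= start and ts >= target_ts:
            return offset
    return None


messages = [(0, 1000), (1, 1010), (2, 1020), (3, 1030), (4, 1040)]
time_index = [(1010, 1), (1030, 3)]

print(first_offset_at_or_after(messages, time_index, 1025))  # 3
```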
