What are the different logs under the Kafka data log dir?


Question

I am trying to understand the Kafka data logs. I can see the logs under the directory set in log.dirs, in subdirectories named "Topicname-partitionnumber". However, I would like to know what the different files captured under each of them are. Below is a screenshot of a sample log.

Recommended answer

In the Kafka data logs, each partition has its own directory under the path configured in log.dirs. Each partition is split into segments.
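Kafka names each partition directory `<topic>-<partition>`. As a minimal sketch, the directory name can be split back into its two parts as shown below. The helper name `parse_partition_dir` and the sample topic names are made up for illustration. Because topic names may themselves contain hyphens, the split is done on the last hyphen only.

```python
def parse_partition_dir(dir_name: str) -> tuple[str, int]:
    """Split a partition directory name into (topic, partition number).

    Hypothetical helper for illustration; assumes the '<topic>-<partition>'
    naming convention and splits on the LAST hyphen, since topic names may
    contain hyphens themselves.
    """
    topic, sep, partition = dir_name.rpartition("-")
    if not sep or not topic or not partition.isdigit():
        raise ValueError(f"not a partition directory name: {dir_name!r}")
    return topic, int(partition)


if __name__ == "__main__":
    print(parse_partition_dir("orders-0"))
    print(parse_partition_dir("click-stream-events-12"))
```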

A segment is just a collection of messages. Instead of writing all messages into a single file, Kafka splits them across multiple segment files.

Whenever Kafka writes to a partition, it writes to the active segment. Each segment has a defined size limit (the broker setting log.segment.bytes). When that limit is reached, Kafka closes the segment and opens a new one, which becomes the active segment. A partition can have one or more segments, depending on the configuration and on how much data has been written.
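The rolling behaviour described above can be sketched with a toy model. This is not Kafka's implementation: `ToyPartition`, `Segment`, and the `segment_bytes` parameter are hypothetical names, only payload bytes are counted, and time-based rolling is ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    base_offset: int                 # offset of the first record in the segment
    size_bytes: int = 0              # payload bytes written so far
    records: list = field(default_factory=list)


class ToyPartition:
    """Toy model of a partition log that rolls segments by size."""

    def __init__(self, segment_bytes: int):
        self.segment_bytes = segment_bytes   # stands in for a size limit
        self.next_offset = 0
        self.segments = [Segment(base_offset=0)]

    def append(self, payload: bytes) -> int:
        active = self.segments[-1]
        # Roll to a new segment when the active one is non-empty and the
        # new record would push it past the size limit.
        if active.records and active.size_bytes + len(payload) > self.segment_bytes:
            active = Segment(base_offset=self.next_offset)
            self.segments.append(active)
        offset = self.next_offset
        active.records.append((offset, payload))
        active.size_bytes += len(payload)
        self.next_offset += 1
        return offset


if __name__ == "__main__":
    partition = ToyPartition(segment_bytes=10)
    for _ in range(4):
        partition.append(b"12345")
    print([s.base_offset for s in partition.segments])
```

With a 10-byte limit and four 5-byte records, the model ends up with two segments whose base offsets are 0 and 2, which mirrors the naming example later in the answer.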

Each segment consists of three files: segment.log, segment.index, and segment.timeindex.

There are three types of file in each Kafka topic partition directory:

-rw-r--r-- 1 kafka hadoop  10485760 Dec  3 23:57 00000000000000000000.index
-rw-r--r-- 1 kafka hadoop 148814230 Oct 11 06:50 00000000000000000000.log
-rw-r--r-- 1 kafka hadoop  10485756 Dec  3 23:57 00000000000000000000.timeindex

The 00000000000000000000 at the start of the log and index file names is the name of the segment. It is the offset of the first record written to that segment (its base offset). For example, if there are two segments, with segment 1 containing message offsets 0 and 1 and segment 2 containing message offsets 2 and 3, the directory looks like this:

-rw-r--r-- 1 kafka hadoop  10485760 Dec  3 23:57 00000000000000000000.index
-rw-r--r-- 1 kafka hadoop 148814230 Oct 11 06:50 00000000000000000000.log
-rw-r--r-- 1 kafka hadoop  10485756 Dec  3 23:57 00000000000000000000.timeindex
-rw-r--r-- 1 kafka hadoop  10485760 Dec  3 23:57 00000000000000000002.index
-rw-r--r-- 1 kafka hadoop 148814230 Oct 11 06:50 00000000000000000002.log
-rw-r--r-- 1 kafka hadoop  10485756 Dec  3 23:57 00000000000000000002.timeindex
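The file names in the listing are the base offset zero-padded to 20 digits, followed by the extension. A small sketch of that convention follows; the function names are illustrative and not part of any Kafka API.

```python
SEGMENT_EXTENSIONS = ("index", "log", "timeindex")


def segment_file_names(base_offset: int) -> list[str]:
    """Build the three file names of a segment from its base offset."""
    prefix = f"{base_offset:020d}"   # zero-pad to 20 digits
    return [f"{prefix}.{ext}" for ext in SEGMENT_EXTENSIONS]


def base_offset_of(file_name: str) -> int:
    """Recover the base offset from a segment file name."""
    return int(file_name.split(".", 1)[0])


if __name__ == "__main__":
    for name in segment_file_names(2):
        print(name, "->", base_offset_of(name))
```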

The .log file stores the messages themselves, each with its offset and timestamp alongside the message content. When reading messages from Kafka starting at a particular offset, scanning a huge log file to find that offset would be expensive. That is where the .index file becomes useful: it maps message offsets to their physical (byte) positions in the .log file, so a reader can jump close to the requested offset instead of scanning from the beginning.
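The lookup described above can be sketched in two steps: pick the segment whose base offset is the largest one not exceeding the target, then use the offset index to find a byte position to start scanning from. This is a simplified in-memory model, not Kafka's on-disk format. The function names and the sample index entries are assumptions, and the index is treated as sparse (not every offset has an entry).

```python
from bisect import bisect_right


def find_segment(base_offsets: list[int], target_offset: int) -> int:
    """Return the base offset of the segment that should hold target_offset.

    base_offsets must be sorted ascending.
    """
    i = bisect_right(base_offsets, target_offset) - 1
    if i < 0:
        raise ValueError("target offset is before the first segment")
    return base_offsets[i]


def find_scan_start(index_entries: list[tuple[int, int]], target_offset: int) -> int:
    """Return the byte position in the .log file to start scanning from.

    index_entries is a sorted list of (offset, byte_position) pairs. The
    result is the position of the last entry whose offset is <= the target,
    or 0 (start of the file) if no such entry exists.
    """
    offsets = [offset for offset, _ in index_entries]
    i = bisect_right(offsets, target_offset) - 1
    return index_entries[i][1] if i >= 0 else 0


if __name__ == "__main__":
    print(find_segment([0, 2], 3))
    sample_index = [(0, 0), (50, 4200), (100, 8400)]   # made-up entries
    print(find_scan_start(sample_index, 75))
```

Binary search keeps both steps logarithmic in the number of segments and index entries; only the final stretch of the log file needs a linear scan.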

The .timeindex file is based on the timestamps of messages: it maps timestamps to offsets, so that messages can be located by time rather than by offset.
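A time-based lookup can be sketched the same way as an offset lookup. The model below is a simplification under stated assumptions: `time_index` is an in-memory sorted list of (timestamp in ms, offset) pairs with made-up values, and `offset_to_scan_from` is a hypothetical helper, not a Kafka API. It returns the offset from which a reader would start scanning to find the first message at or after the requested time.

```python
from bisect import bisect_right


def offset_to_scan_from(
    time_index: list[tuple[int, int]],
    target_ts_ms: int,
    base_offset: int = 0,
) -> int:
    """Return the offset to start scanning from for a timestamp lookup.

    Picks the last entry whose timestamp is <= target_ts_ms. If every entry
    is newer than the target, scanning starts at the segment's base offset.
    """
    timestamps = [ts for ts, _ in time_index]
    i = bisect_right(timestamps, target_ts_ms) - 1
    return time_index[i][1] if i >= 0 else base_offset


if __name__ == "__main__":
    sample_time_index = [(1_000, 0), (2_000, 40), (3_000, 85)]   # made-up entries
    print(offset_to_scan_from(sample_time_index, 2_500))
```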
