Reading messages for a specific timestamp in Kafka

Question

I want to read all the messages in Kafka starting from a specific time. Say I want to read all messages between 0600 and 0800.

The question "Request messages between two timestamps from Kafka" suggests using offsetsForTimes as the solution.

The problem with that solution: suppose my consumer is switched on every day at 1300. The consumer would not have read any messages that day, which effectively means no offset was committed at or after 0600, which means offsetsForTimes(<partitionname>, <0600 for that day in millis>) will return null.

Is there any way I can read a message that was published to a Kafka queue at a certain time, irrespective of offsets?

Recommended answer

offsetsForTimes() returns, for each partition, the earliest offset whose message timestamp is greater than or equal to the requested time. It works regardless of whether any offsets were committed, because the offsets are looked up directly in the partition logs. It returns null for a partition only when that partition holds no message at or after the requested time.
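offsetsForTimes() expects its timestamps as epoch milliseconds, so the 0600 and 0800 bounds have to be converted first. A minimal sketch using java.time follows; the class name TimeWindow, the helper toEpochMillis, and the choice of UTC are assumptions made for illustration, not part of the original answer.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneId;

public class TimeWindow {
    // Converts a wall-clock time on a given day, in a given zone, to epoch milliseconds.
    public static long toEpochMillis(LocalDate day, LocalTime time, ZoneId zone) {
        return day.atTime(time).atZone(zone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("UTC");       // assumption: use the zone your producers think in
        LocalDate day = LocalDate.now(zone);  // "that day"

        long startMs = toEpochMillis(day, LocalTime.of(6, 0), zone); // 0600
        long endMs = toEpochMillis(day, LocalTime.of(8, 0), zone);   // 0800

        System.out.println("window: " + startMs + " .. " + endMs);
    }
}
```

The zone matters: 0600 in the local zone of the producers and 0600 UTC are different instants, so the lookup should use whichever one the window is actually defined in.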

So yes, you should use this method to find the first offset produced at or after 0600, seek to that position, and consume messages until you reach 0800.
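The approach can be sketched with the Java consumer roughly as follows. This is an illustration under stated assumptions rather than a definitive implementation: the broker address localhost:9092, the topic name "events", the single partition 0, and the string deserializers are all hypothetical, and the window bounds are passed in as epoch milliseconds on the command line. It needs the kafka-clients library on the classpath.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReadTimeWindow {
    public static void main(String[] args) {
        long startMs = Long.parseLong(args[0]); // e.g. 0600 of that day in millis
        long endMs = Long.parseLong(args[1]);   // e.g. 0800 of that day in millis

        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumption: local broker
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        props.put("enable.auto.commit", "false"); // committed offsets play no role here

        TopicPartition tp = new TopicPartition("events", 0); // hypothetical topic and partition

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Manual assignment: no consumer group or committed offset is involved.
            consumer.assign(Collections.singletonList(tp));

            // Look up the earliest offset whose timestamp is >= startMs.
            Map<TopicPartition, OffsetAndTimestamp> found =
                    consumer.offsetsForTimes(Collections.singletonMap(tp, startMs));
            OffsetAndTimestamp first = found.get(tp);
            if (first == null) {
                System.out.println("No messages at or after the start time");
                return;
            }
            consumer.seek(tp, first.offset());

            // Snapshot the current end of the log so the loop ends if 0800 is never reached.
            long endOffset = consumer.endOffsets(Collections.singletonList(tp)).get(tp);

            boolean done = false;
            while (!done && consumer.position(tp) < endOffset) {
                for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofSeconds(1))) {
                    if (rec.timestamp() >= endMs) {
                        done = true; // first record at or past the end of the window
                        break;
                    }
                    System.out.println(rec.offset() + " " + rec.timestamp() + " " + rec.value());
                }
            }
        }
    }
}
```

For a topic with several partitions, the same lookup and seek would be repeated per partition. Stopping at the first record at or past the end time assumes timestamps increase within a partition; with producer-supplied timestamps that is not guaranteed, so a stricter reader might filter each record by timestamp instead of breaking out early.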
