How to get data from an old offset point in Kafka?


Question

I am using ZooKeeper to get data from Kafka, and I always get data from the last offset point. Is there any way to specify a time for the offset so that I can get old data?

There is an option, autooffset.reset, which accepts smallest or largest. Can someone explain what smallest and largest mean? Can autooffset.reset help in getting data from an old offset point instead of the latest one?

Recommended answer

Consumers always belong to a group and, for each partition, ZooKeeper keeps track of that consumer group's progress in the partition.

To fetch from the beginning, you can delete all the data associated with that progress, as Hussain mentioned:

ZkUtils.maybeDeletePath("${zkhost:zkport}", "/consumers/${group.id}");

You can also set the offset of the partition you want, as done in core/src/main/scala/kafka/tools/UpdateOffsetsInZK.scala:

ZkUtils.updatePersistentPath(zkClient, topicDirs.consumerOffsetDir + "/" + partition, offset.toString)

However, offsets are not indexed by time; all you know is that each partition is an ordered sequence.

If your messages contain a timestamp (beware that this timestamp has nothing to do with the moment Kafka received the message), you can build an indexer that retrieves one entry every N offsets and stores a tuple such as (topic X, partition 2, offset 100, timestamp) somewhere.

When you want to retrieve entries from a specified moment in time, you can apply a binary search to this rough index until you find the entry you want, and fetch from there.
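The rough index and the binary search over it could look roughly like the sketch below. It is only an illustration, not part of Kafka: the class name CoarseTimeIndex and its methods are hypothetical, it covers a single partition, and it assumes the timestamps inside your messages never decrease as the offset grows. If that assumption does not hold for your data, a binary search will not give reliable results.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical coarse index: one (offset, timestamp) sample every N messages of one partition. */
final class CoarseTimeIndex {
    // Each entry is {offset, timestampMillis}; entries are kept in increasing offset order.
    private final List<long[]> samples = new ArrayList<>();

    /** Add samples in increasing offset order; timestamps are assumed to be non-decreasing. */
    void add(long offset, long timestampMillis) {
        samples.add(new long[] {offset, timestampMillis});
    }

    /**
     * Binary search for the last sampled offset whose timestamp is strictly before the target.
     * Scanning forward from that offset reaches the first message at or after the target time.
     * Returns -1 when no sample is older than the target (start from the earliest offset instead).
     */
    long startOffsetFor(long targetMillis) {
        int lo = 0;
        int hi = samples.size() - 1;
        long best = -1L;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (samples.get(mid)[1] < targetMillis) {
                best = samples.get(mid)[0]; // candidate; look for a later one
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return best;
    }
}
```

The value returned by startOffsetFor is only a starting point: you would set the partition's offset to it (for example with the updatePersistentPath call above) and skip messages until you reach the timestamp you are after. A smaller N makes that forward scan shorter at the cost of a larger index.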
