Offsets stored in Zookeeper or Kafka?


Problem description

I'm a bit confused about where offsets are stored when using Kafka and Zookeeper. It seems like offsets in some cases are stored in Zookeeper, in other cases they are stored in Kafka.

What determines whether the offset is stored in Kafka or in Zookeeper? And what are the pros and cons?

NB: Of course I could also store the offset on my own in some different data store, but that is not part of the picture for this post.

Some more details about my setup:

  • I run the following versions: KAFKA_VERSION="0.10.1.0", SCALA_VERSION="2.11"
  • I use kafka-node from a NodeJS application to connect to Kafka/Zookeeper.

Answer

Older versions of Kafka (pre-0.9) store offsets in ZK only, while newer versions of Kafka by default store offsets in an internal Kafka topic called __consumer_offsets (newer versions might still commit to ZK, though).

The advantage of committing offsets to the broker is that the consumer does not depend on ZK; clients then only need to talk to brokers, which simplifies the overall architecture. Also, for large deployments with a lot of consumers, ZK can become a bottleneck, while Kafka can handle this load easily (committing offsets is the same thing as writing to a topic, and Kafka scales very well here -- in fact, by default __consumer_offsets is created with 50 partitions, IIRC).
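To make the "committing offsets is just writing to a topic" point concrete: Kafka assigns each consumer group to one partition of __consumer_offsets by hashing the group id (roughly abs(groupId.hashCode()) % numPartitions, with 50 partitions by default). A minimal sketch of that mapping; the group id used here is purely illustrative:

```java
public class OffsetsPartitionSketch {
    // Default partition count of __consumer_offsets
    // (the offsets.topic.num.partitions broker setting).
    static final int NUM_OFFSETS_PARTITIONS = 50;

    // All offset commits for one group go to the same partition, chosen
    // from the group id's hash. The bitmask keeps the value non-negative
    // (mirroring Kafka's positive-hash helper), so the modulo is always
    // a valid partition number.
    static int partitionFor(String groupId) {
        return (groupId.hashCode() & 0x7fffffff) % NUM_OFFSETS_PARTITIONS;
    }

    public static void main(String[] args) {
        String groupId = "my-consumer-group"; // hypothetical group id
        System.out.println("group '" + groupId + "' -> __consumer_offsets partition "
                + partitionFor(groupId));
    }
}
```

Because the mapping is deterministic, a single broker (the coordinator hosting that partition) owns all offset commits for a given group, which is what lets offset storage scale across the topic's partitions.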

I am not familiar with NodeJS or kafka-node -- it depends on the client implementation how offsets are committed.

Long story short: if you use 0.10.1.0 brokers, you can commit offsets to the topic __consumer_offsets. But it depends on your client whether it implements this protocol.

In more detail, it depends on your broker and client version (and which consumer API you are using), because older clients can talk to newer brokers. First, you need broker and client version 0.9 or later to be able to write offsets into the Kafka topic. But if an older client connects to a 0.9 broker, it will still commit offsets to ZK.

For Java consumers:

It depends on which consumer you are using: before 0.9 there were two "old consumers", namely the "high-level consumer" and the "low-level consumer". Both commit offsets directly to ZK. Since 0.9, the two were merged into a single consumer, called the "new consumer" (it basically unifies the low-level and high-level APIs of the two old consumers -- this means that in 0.9 there are three types of consumers). The new consumer commits offsets to the brokers (i.e., the internal Kafka topic).
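The practical difference shows up in the configuration: the new consumer is configured against brokers only, with no zookeeper.connect entry, precisely because offsets (and group coordination) go through the brokers. A minimal sketch of such a configuration; the host names and group id are hypothetical:

```java
import java.util.Properties;

public class NewConsumerConfigSketch {
    // Configuration for a 0.9+ "new consumer": note there is no
    // zookeeper.connect entry anywhere -- only broker addresses.
    static Properties newConsumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092,broker2:9092"); // hypothetical hosts
        props.put("group.id", "my-consumer-group");                  // hypothetical group id
        // Offsets are committed to the internal __consumer_offsets topic.
        props.put("enable.auto.commit", "true");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // In a real application these properties would be passed to
        // new KafkaConsumer<>(newConsumerProps()).
        System.out.println(newConsumerProps().getProperty("bootstrap.servers"));
    }
}
```

The absence of any ZK setting is the point: the client's only dependency is the broker list.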

To make upgrading easier, there is also the possibility to "double commit" offsets using the old consumer (as of 0.9). If you enable this via dual.commit.enabled, offsets are committed to both ZK and the __consumer_offsets topic. This allows you to switch from the old consumer API to the new consumer API while moving your offsets from ZK to the __consumer_offsets topic.
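A sketch of what the old consumer's configuration might look like during such a migration -- offsets.storage and dual.commit.enabled are the old-consumer settings mentioned above; the ZK address and group id are illustrative:

```properties
# Old (pre-0.9 style) high-level consumer, running against a 0.9+ broker.
zookeeper.connect=zk1:2181
group.id=my-consumer-group

# Store offsets in Kafka (__consumer_offsets) instead of ZK...
offsets.storage=kafka
# ...but also keep committing them to ZK during the migration.
dual.commit.enabled=true
```

Once all consumers in the group are committing to Kafka, dual committing can be switched off (and eventually the group moved to the new consumer API).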

