Offsets stored in Zookeeper or Kafka?


Question

I'm a bit confused about where offsets are stored when using Kafka and Zookeeper. In some cases offsets seem to be stored in Zookeeper, while in other cases they are stored in Kafka.

What determines whether offsets are stored in Kafka or in Zookeeper? And what are the pros and cons of each?

NB: Of course I could also store offsets on my own in some other data store, but that is out of scope for this post.

Some more details about my setup:

  • I run the following versions: KAFKA_VERSION="0.10.1.0", SCALA_VERSION="2.11"
  • I use kafka-node from a NodeJS application to connect to Kafka/Zookeeper.

Answer

Older versions of Kafka (pre-0.9) store offsets in ZK only, while newer versions of Kafka store offsets by default in an internal Kafka topic called __consumer_offsets (newer versions might still commit to ZK, though).

The advantage of committing offsets to the broker is that the consumer does not depend on ZK; clients then only need to talk to brokers, which simplifies the overall architecture. Also, for large deployments with many consumers, ZK can become a bottleneck, while Kafka handles this load easily (committing offsets is the same as writing to a topic, and Kafka scales very well here; in fact, by default __consumer_offsets is created with 50 partitions, IIRC).
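Since committing an offset is just a produce to __consumer_offsets, Kafka routes each group's commits to one partition of that topic by hashing the group id, which keeps a group's offset history ordered. A minimal sketch of that mapping (the group id "my-group" and the default partition count of 50 are assumptions; the bitmask mirrors Kafka's Utils.abs, which differs from Math.abs only for Integer.MIN_VALUE):

```java
// Sketch: which __consumer_offsets partition receives a group's commits.
// Kafka computes Utils.abs(groupId.hashCode()) % numPartitions.
public class OffsetsPartition {

    static int partitionFor(String groupId, int numPartitions) {
        // The mask forces a non-negative value, like Kafka's Utils.abs.
        return (groupId.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        // All commits for a given group land in the same partition.
        System.out.println(partitionFor("my-group", 50));
    }
}
```

Because the mapping depends only on the group id, two consumers in the same group always commit to the same partition, which is why ZK-style coordination is not needed for offset storage.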

I am not familiar with NodeJS or kafka-node; how offsets are committed depends on the client implementation.

Long story short: if you use 0.10.1.0 brokers, you can commit offsets to the __consumer_offsets topic. But whether that protocol is implemented depends on your client.

In more detail, it depends on your broker and client versions (and which consumer API you are using), because older clients can talk to newer brokers. First, you need both broker and client at version 0.9 or later to be able to write offsets into the Kafka topic. But if an older client connects to a 0.9 broker, it will still commit offsets to ZK.

For Java consumers:

It depends on which consumer you are using: before 0.9 there were two "old consumers", namely the "high-level consumer" and the "low-level consumer". Both commit offsets directly to ZK. Since 0.9, the two have been merged into a single consumer, called the "new consumer" (it basically unifies the low-level and high-level APIs of both old consumers; this means that in 0.9 there are three types of consumers). The new consumer commits offsets to the brokers (i.e., the internal Kafka topic).
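The split is visible in the client configuration: the old consumers were bootstrapped via zookeeper.connect, while the new consumer needs only bootstrap.servers and never contacts ZK. A minimal sketch of a new-consumer configuration (the broker address, group id, and deserializer choice are assumptions; actually creating a KafkaConsumer from these properties additionally requires the kafka-clients dependency and a running broker):

```java
import java.util.Properties;

public class NewConsumerConfig {

    static Properties newConsumerProps() {
        Properties props = new Properties();
        // New consumer: brokers only -- no zookeeper.connect entry at all.
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "my-group");                // offsets are stored per group.id
        // Commit explicitly (e.g. via commitSync()) rather than on a timer.
        props.put("enable.auto.commit", "false");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // With these props you would build: new KafkaConsumer<String, String>(props)
        System.out.println(newConsumerProps().getProperty("bootstrap.servers"));
    }
}
```

With enable.auto.commit set to false, each committed offset is written to __consumer_offsets only when the application calls commitSync()/commitAsync(), which is the usual choice when you need at-least-once processing.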

To make upgrading easier, there is also the possibility to "dual commit" offsets using the old consumer (as of 0.9). If you enable this via dual.commit.enabled, offsets are committed both to ZK and to the __consumer_offsets topic. This allows you to switch from the old consumer API to the new consumer API while moving your offsets from ZK to the __consumer_offsets topic.
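For the old consumer, that migration is driven by two settings in its consumer properties; a sketch of the relevant fragment (the two-step rollout described in the comments follows the documented ZK-to-Kafka migration path):

```properties
# Old-consumer settings for migrating offsets from ZK to __consumer_offsets.
# Step 1: commit to Kafka, but keep writing to ZK while old consumers run.
offsets.storage=kafka
dual.commit.enabled=true
# Step 2: once all consumers run with these settings, set
# dual.commit.enabled=false, then move the application to the new consumer API.
```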

