Is a Kafka Streams StateStore global across all instances or just local?

Question

In the Kafka Streams WordCount example, a StateStore is used to store the word counts. If there are multiple instances in the same consumer group, is the StateStore global to the group, or local to a single consumer instance?

Thanks

Recommended answer

This depends on how you look at a state store.

  1. In Kafka Streams, state is sharded, so each instance holds only part of the overall application state. For example, the stateful DSL operators use a local RocksDB instance to hold their shard of the state. In this regard, the state is local.

  2. On the other hand, all changes to the state are written to a Kafka topic. This changelog topic does not "live" on the application host but in the Kafka cluster; it consists of multiple partitions and can be replicated. If an instance fails, the changelog topic is used to recreate its state in another instance that is still running. Because the changelog is accessible to all application instances, the state can also be considered global.
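
The two views above can be illustrated with a small plain-Java model. This is only a sketch and does not use the Kafka Streams API: the class names (ShardedStateSketch, Changelog, Instance) are hypothetical, a HashMap stands in for the local RocksDB store, and an in-memory list per partition stands in for the changelog topic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not the Kafka Streams API.
class ShardedStateSketch {

    // "Global" view: the changelog lives outside any single instance
    // (in real deployments, in a topic on the Kafka cluster).
    static class Changelog {
        final Map<Integer, List<Map.Entry<String, Long>>> byPartition = new HashMap<>();
    }

    // "Local" view: each instance only stores the shard it is working on.
    static class Instance {
        final Map<String, Long> localStore = new HashMap<>(); // stand-in for RocksDB
        final Changelog changelog;

        Instance(Changelog changelog) {
            this.changelog = changelog;
        }

        // Update the local store, then append the new value to the changelog.
        void count(int partition, String word) {
            long updated = localStore.merge(word, 1L, Long::sum);
            changelog.byPartition
                     .computeIfAbsent(partition, p -> new ArrayList<>())
                     .add(Map.entry(word, updated));
        }

        // Take over a partition by replaying its changelog into the local store.
        void restore(int partition) {
            for (Map.Entry<String, Long> change
                    : changelog.byPartition.getOrDefault(partition, List.of())) {
                localStore.put(change.getKey(), change.getValue());
            }
        }
    }

    public static void main(String[] args) {
        Changelog changelog = new Changelog();
        Instance a = new Instance(changelog); // assumed to own partition 0
        Instance b = new Instance(changelog); // assumed to own partition 1

        a.count(0, "kafka");
        a.count(0, "kafka");
        b.count(1, "streams");

        // Suppose instance a fails: b rebuilds a's shard from the changelog.
        b.restore(0);
        System.out.println(b.localStore);
    }
}
```

Before the restore, instance b knows nothing about "kafka"; after replaying partition 0 of the changelog, it holds both shards.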

Keep in mind that the changelog is the source of truth for the application state, and the local stores are basically caches of shards of that state.

Moreover, in the WordCount example, the record stream (the data stream) is partitioned by word, so that the count for a given word is maintained by a single instance (and different instances maintain the counts for different words).
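
As a rough illustration of why one word always ends up on the same instance, here is a minimal sketch of key-based partitioning. The class and method names are hypothetical, and String.hashCode is used as a simple stand-in; Kafka's default partitioner hashes the serialized key bytes with murmur2 instead, so real partition numbers will differ.

```java
import java.util.List;

// Hypothetical sketch, not Kafka's partitioner implementation.
class WordPartitionSketch {

    // Map a word (the record key) to a partition. The same word always
    // yields the same partition for a fixed partition count.
    static int partitionFor(String word, int numPartitions) {
        return Math.floorMod(word.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 4; // assumed partition count for the example
        for (String word : List.of("kafka", "streams", "kafka", "state", "kafka")) {
            System.out.println(word + " -> partition " + partitionFor(word, numPartitions));
        }
    }
}
```

Every occurrence of "kafka" maps to the same partition, so whichever instance is assigned that partition sees all of them and keeps the complete count for that word in its local store.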

For an architectural overview, I recommend http://docs.confluent.io/current/streams/architecture.html

This blog post should also be interesting: http://www.confluent.io/blog/unifying-stream-processing-and-interactive-queries-in-apache-kafka/
