Can Kafka Streams compute a grand total?


Question

A topic named "addcash" has 3 partitions (the Kafka cluster also has 3 machines), and a large volume of user recharge messages flows through it. I want to compute the total amount of money recharged each day. From articles about Kafka Streams I learned that Kafka Streams runs the topology as tasks, that the number of tasks depends on the number of topic partitions, and that every task has its own state store. So when I sum the amounts using a state store, will I get three values back instead of one total? What is the right way to do this? Thanks!

Recommended answer

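That is correct: each task keeps its own state store, so a plain aggregation gives you one partial sum per partition rather than a single total.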

You have two ways to do this:

  1. Compute the partial sums first, then follow up with KTable.groupBy(...).reduce(...), mapping every record to a single global key so that all partial aggregates are brought together.

  2. Create an additional single-partition topic, write the partial results into it, read the data back with Kafka Streams, and run a second aggregation that adds the partial numbers together. You can express this in a single program by using through("my-single-partition-topic") to connect the first and second parts of the aggregation (through() is deprecated since Kafka 2.6; to() followed by StreamsBuilder.stream() on the same topic is the equivalent). For this solution, the second aggregation step has to be done with transform() rather than with a DSL aggregation operator.
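The logic of the first approach can be modeled in plain Java with no Kafka dependency. This is only a sketch of the idea, not Kafka Streams code: the class TotalSumSketch and its methods are hypothetical names introduced here. It assumes each partition keeps its own running partial sum (as each task's state store would) and that every update to a partial sum is forwarded under one global key, where the old partial value is subtracted and the new one is added, which is how the adder/subtractor pair of KTable.groupBy(...).reduce(...) behaves.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "partial sums per partition, then one global key".
// Not Kafka Streams code; it only mirrors the two aggregation stages.
final class TotalSumSketch {
    // Stage 1: one running partial sum per partition, like one state store per task.
    private final Map<Integer, Long> partialSums = new HashMap<>();
    // Stage 2: the single value kept under the global key.
    private long total = 0L;

    // Process one recharge record that arrived on the given partition.
    void onRecord(int partition, long amount) {
        long oldPartial = partialSums.getOrDefault(partition, 0L);
        long newPartial = oldPartial + amount;
        partialSums.put(partition, newPartial);
        // The changed partial sum is re-keyed to the global key:
        // subtract the old partial value, then add the new one.
        total = total - oldPartial + newPartial;
    }

    long partialSum(int partition) {
        return partialSums.getOrDefault(partition, 0L);
    }

    long total() {
        return total;
    }

    public static void main(String[] args) {
        TotalSumSketch sketch = new TotalSumSketch();
        sketch.onRecord(0, 100L);
        sketch.onRecord(1, 50L);
        sketch.onRecord(2, 25L);
        sketch.onRecord(0, 10L);
        for (int p = 0; p < 3; p++) {
            System.out.println("partition " + p + " partial sum = " + sketch.partialSum(p));
        }
        System.out.println("total = " + sketch.total());
    }
}
```

With the four sample records in main, the three partial sums are 110, 50 and 25, and the value under the global key is 185. The partial sums alone are what the question describes; the second stage is what turns them into one total.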
