Using Kafka Streams to compute a total sum?

Question

A topic named "addcash" has 3 partitions (the Kafka cluster also has 3 machines), and a lot of user recharge messages flow through it. I want to compute the total amount of money every day. From some articles about Kafka Streams I learned that Kafka Streams runs the topology as tasks, that the number of tasks depends on the number of partitions of the topic, and that every task has its own state store. So when I compute the total amount using a state store, will I get three values back instead of one total? What is the right way to do this? Thanks!

Recommended answer

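Correct. Each task keeps its own state store, so you would get three partial sums rather than one total.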

You have two ways to do this:

  1. Compute the partial sums, then follow up with KTable.groupBy(...).reduce(...), setting a single global key to bring all partial aggregates together.

  2. Alternatively, create an additional single-partition topic, write the partial results into it, read the data back with Kafka Streams, and run a second aggregation that adds the partial numbers together. You can express this as a single program by using through("my-single-partition-topic") to connect the first and second parts of the aggregation. For this solution, the second aggregation step would need to use transform() rather than the DSL.
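To make the two-stage idea concrete, here is a minimal plain-Java model of it. It does not call the Kafka Streams API; the class name TwoStageTotal, the method names, and the sample amounts are all made up for illustration. Stage 1 keeps one partial sum per partition, the way each task keeps its own state store. Stage 2 keeps a single total under one global key and, whenever a partial sum changes, retracts the old partial value and adds the new one, which is roughly the role the adder and subtractor play in KTable.groupBy(...).reduce(...).

```java
import java.util.Arrays;

// Hypothetical model of a two-stage aggregation; not Kafka Streams code.
final class TwoStageTotal {
    // Stage 1: one partial sum per partition, standing in for per-task state stores.
    private final long[] partialSums;
    // Stage 2: a single value, standing in for the store behind the one global key.
    private long total = 0L;

    TwoStageTotal(int partitions) {
        this.partialSums = new long[partitions];
    }

    // Processes one recharge message that arrived on the given partition.
    void process(int partition, long amount) {
        long oldPartial = partialSums[partition];
        long newPartial = oldPartial + amount;
        partialSums[partition] = newPartial;
        // Retract the old partial sum and add the updated one, so each
        // partition contributes exactly once to the total.
        total = total - oldPartial + newPartial;
    }

    long[] partialSums() {
        return partialSums.clone();
    }

    long total() {
        return total;
    }

    public static void main(String[] args) {
        TwoStageTotal agg = new TwoStageTotal(3); // 3 partitions, as for "addcash"
        agg.process(0, 100);
        agg.process(1, 20);
        agg.process(0, 50);
        agg.process(2, 30);
        agg.process(2, 5);
        System.out.println(Arrays.toString(agg.partialSums())); // [150, 20, 35]
        System.out.println(agg.total());                        // 205
    }
}
```

Reading only the stage 1 stores gives three values (150, 20, 35), which is the situation described in the question; the stage 2 value is the single total (205).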
