Kafka Streams thread number

Question

I am new to Kafka Streams and am confused about the maximum parallelism of a Kafka Streams application. I went through the following link but did not find the answer I was looking for: https://docs.confluent.io/current/streams/faq.html#streams-faq-scalability-maximum-parallelism

If I have 2 input topics, one with 10 partitions and the other with 5 partitions, and only one Kafka Streams application instance is running to process these two input topics, what is the maximum number of threads I can have in this case? 10 or 15?

Recommended answer

"If I have 2 input topics, one with 10 partitions and the other with 5 partitions"

Sounds good. So you have 15 partitions in total. Let's assume you have a simple processor topology without joins or aggregations, in which each topic is read by its own sub-topology, so that all 15 partitions are just being statelessly transformed.

Then each of the 15 input partitions will map to a single Kafka Streams "task". If you have 1 thread, input from these 15 tasks will be processed by that 1 thread. If you have 15 threads, each task will have a dedicated thread to handle its input. So you can run 1 application instance with 15 threads or 15 instances with 1 thread each, and the two are logically similar: you process 15 tasks in 15 threads. The only difference is that 15 instances with 1 thread each let you spread the load across JVMs.

Likewise, if you start 15 instances of the application, each with 1 thread, then each instance will be assigned 1 task, and the single thread in each instance will handle that 1 task.
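
The two layouts described above can be compared with a small stand-alone sketch. It assumes one task per input partition (10 + 5 = 15), as in this answer, and spreads the tasks round-robin over all threads. The class and method names are made up for illustration, and the real Kafka Streams assignor is more sophisticated than a plain round-robin, so treat this as a model of the arithmetic only.

```java
import java.util.Arrays;

public class TaskDistributionSketch {

    // Spreads `tasks` round-robin over every thread of every instance and
    // returns how many tasks each thread ends up with.
    // Hypothetical helper: this is NOT the actual Kafka Streams assignor.
    public static int[] tasksPerThread(int tasks, int instances, int threadsPerInstance) {
        int totalThreads = instances * threadsPerInstance;
        if (totalThreads <= 0) {
            throw new IllegalArgumentException("need at least one thread");
        }
        int[] load = new int[totalThreads];
        for (int task = 0; task < tasks; task++) {
            load[task % totalThreads]++;
        }
        return load;
    }

    public static void main(String[] args) {
        int tasks = 10 + 5; // one task per input partition, as assumed in the answer

        // 1 instance, 1 thread: that thread handles all 15 tasks.
        System.out.println(Arrays.toString(tasksPerThread(tasks, 1, 1)));

        // 1 instance with 15 threads, or 15 instances with 1 thread each:
        // either way every thread handles exactly one task.
        System.out.println(Arrays.toString(tasksPerThread(tasks, 1, 15)));
        System.out.println(Arrays.toString(tasksPerThread(tasks, 15, 1)));
    }
}
```

The last two calls return the same per-thread load, which is the sense in which the two deployments are logically similar. What differs is only how many JVMs those threads live in.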

"What is the maximum number of threads I can have in this case? 10 or 15?"

You can set the thread count to any value. If the total number of threads across all instances exceeds the total number of tasks, the extra threads will remain idle.
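
As a rough illustration, the number of threads per instance is controlled by the `num.stream.threads` setting (exposed in the client library as `StreamsConfig.NUM_STREAM_THREADS_CONFIG`). The sketch below builds the configuration with plain `java.util.Properties` and string keys so that it runs without Kafka on the classpath. The application id and broker address are placeholders, and `idleThreads` is a hypothetical helper that only restates the rule above.

```java
import java.util.Properties;

public class ThreadCountSketch {

    // Threads beyond one per task have nothing to work on.
    // Hypothetical helper restating the rule from the answer.
    public static int idleThreads(int totalTasks, int totalThreads) {
        return Math.max(0, totalThreads - totalTasks);
    }

    // Builds a Kafka Streams configuration with the given thread count.
    public static Properties streamsConfig(int threads) {
        Properties props = new Properties();
        props.setProperty("application.id", "my-streams-app");    // placeholder value
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder value
        props.setProperty("num.stream.threads", Integer.toString(threads));
        return props;
    }

    public static void main(String[] args) {
        Properties props = streamsConfig(20);
        System.out.println("num.stream.threads = " + props.getProperty("num.stream.threads"));

        // 15 tasks and 20 threads in a single instance: 5 threads stay idle.
        System.out.println("idle threads = " + idleThreads(15, 20));
    }
}
```

In a real application these properties would be passed to the `KafkaStreams` constructor together with the topology.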

I recommend reading https://docs.confluent.io/current/streams/architecture.html#parallelism-model if you haven't already. Also study the logs your application produces when it starts up. Each thread logs the tasks it is assigned, like this:

[2018-01-04 16:45:26,859] INFO (org.apache.kafka.streams.processor.internals.StreamThread:351) stream-thread [entities-eb9c0a9b-ecad-48c1-b4e8-715dcf2afef3-StreamThread-3] partition assignment took 110 ms.
current active tasks: [0_0, 0_2, 1_2, 2_2, 3_2, 4_2, 5_2, 6_2, 7_2, 8_2, 9_2, 10_2, 11_2, 12_2, 13_2, 14_2]
current standby tasks: []
previous active tasks: []
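
To read such a log line, it helps to know that a task ID has the form `<sub-topology>_<partition>`, so `0_2` is partition 2 of sub-topology 0. The sketch below groups the task IDs of one line by sub-topology. The class and method names are made up for illustration, and the parser assumes the plain `N_M` format shown in this log, which may differ in other Kafka Streams versions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TaskIdSketch {

    // Parses a line such as "current active tasks: [0_0, 0_2, 1_2]" into a map
    // from sub-topology id to the partitions assigned for that sub-topology.
    // Assumes the plain "<sub-topology>_<partition>" format shown in the log above.
    public static Map<Integer, List<Integer>> partitionsBySubtopology(String logLine) {
        Map<Integer, List<Integer>> result = new TreeMap<>();
        int open = logLine.indexOf('[');
        int close = logLine.lastIndexOf(']');
        if (open < 0 || close <= open) {
            return result; // no task list on this line
        }
        String body = logLine.substring(open + 1, close).trim();
        if (body.isEmpty()) {
            return result; // e.g. "current standby tasks: []"
        }
        for (String id : body.split(",")) {
            String[] parts = id.trim().split("_");
            int subtopology = Integer.parseInt(parts[0]);
            int partition = Integer.parseInt(parts[1]);
            result.computeIfAbsent(subtopology, k -> new ArrayList<>()).add(partition);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(partitionsBySubtopology("current active tasks: [0_0, 0_2, 1_2]"));
        System.out.println(partitionsBySubtopology("current standby tasks: []"));
    }
}
```

Counting the entries on each thread's "current active tasks" line shows how many tasks that thread actually received.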
