Why is the parallel execution of an Apache Flink application slower than the sequential execution?

Problem description

I have an Apache Flink setup with one TaskManager and two processing slots. When I run an application with the parallelism set to 1, the job takes around 33 seconds. When I increase the parallelism to 2, the job takes 45 seconds to complete.

I am using Flink on my Windows machine, which is configured with 10 compute cores (4C + 6G). I want to achieve better results with 2 slots. What can I do?

Recommended answer

Distributed systems like Apache Flink are designed to run in data centers on hundreds of machines. They are not designed to parallelize computations on a single computer. Moreover, Flink targets large-scale problems. Jobs that run in seconds on a local machine are not the primary use case for Flink.

Parallelizing an application always causes overhead. Data has to be distributed and shared between processes and threads. Flink distributes data across TaskManager slots by serializing and deserializing it. Moreover, starting and coordinating distributed tasks also does not come for free.
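
To get a feel for the serialization cost mentioned above, here is a minimal sketch in plain Java. It is an illustration only: it uses `java.io` object serialization as a stand-in, whereas Flink uses its own type serializers, so the absolute numbers will not match what Flink does. The class and method names (`SerializationOverheadSketch`, `roundTrip`, `sum`) are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not Flink code: compares reading records
// through a shared reference with reading a serialized-and-restored copy.
public class SerializationOverheadSketch {

    // Copies the records by writing them to bytes and reading them back,
    // a stand-in for what happens when data crosses a task boundary.
    static List<Integer> roundTrip(List<Integer> records)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(new ArrayList<>(records));
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            @SuppressWarnings("unchecked")
            List<Integer> copy = (List<Integer>) in.readObject();
            return copy;
        }
    }

    // The actual "work": adds up all records.
    static long sum(List<Integer> records) {
        long total = 0;
        for (int r : records) {
            total += r;
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            records.add(i);
        }

        long t0 = System.nanoTime();
        long direct = sum(records);            // same heap object, no copy
        long t1 = System.nanoTime();
        long viaBytes = sum(roundTrip(records)); // serialize, deserialize, then sum
        long t2 = System.nanoTime();

        System.out.printf("shared reference: sum=%d in %d ms%n",
                direct, (t1 - t0) / 1_000_000);
        System.out.printf("serialized copy:  sum=%d in %d ms%n",
                viaBytes, (t2 - t1) / 1_000_000);
    }
}
```

Both paths compute the same sum; the second one additionally pays for encoding and decoding every record. When the per-record work is as cheap as it is here, that extra cost can easily dominate, which is the effect the answer describes for small jobs.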

It is not surprising to observe longer execution times when scaling a small-scale problem with a distributed system on a single machine. You could port the application to a thread-parallel application that leverages shared memory.
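
As a rough idea of what a thread-parallel, shared-memory version could look like, here is a small sketch using only the Java standard library. The workload (summing an `int[]`) and all names (`SharedMemorySumSketch`, `parallelSum`, `sumRange`) are assumptions chosen for illustration; the original question does not say what the job computes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: worker threads read the same array directly,
// so no data is serialized or copied between them.
public class SharedMemorySumSketch {

    // Sums data[from, to) in place; every worker reads the shared array.
    static long sumRange(int[] data, int from, int to) {
        long total = 0;
        for (int i = from; i < to; i++) {
            total += data[i];
        }
        return total;
    }

    // Splits the array into one contiguous chunk per worker thread
    // and adds up the partial sums. Expects workers >= 1.
    static long parallelSum(int[] data, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            int chunk = (data.length + workers - 1) / workers;
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(data.length, w * chunk);
                final int to = Math.min(data.length, from + chunk);
                Callable<Long> task = () -> sumRange(data, from, to);
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // waits for that worker to finish
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] data = new int[10_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i % 100;
        }
        // Two workers, mirroring the two slots from the question.
        System.out.println(parallelSum(data, 2));
    }
}
```

The only coordination cost here is starting the threads and collecting two partial results, which is why this style tends to suit small problems on a single machine better than a distributed runtime.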
