Why does Dask perform so much slower while multiprocessing performs so much faster?


Question

To get a better understanding of parallelism, I am comparing several different pieces of code.

Here is the basic one (code_piece_1).

import time

# setup
problem_size = 1e7
items = range(9)

# serial
def counter(num=0):
    junk = 0
    for i in range(int(problem_size)):
        junk += 1
        junk -= 1
    return num

def sum_list(args):
    print("sum_list fn:", args)
    return sum(args)

start = time.time()
summed = sum_list([counter(i) for i in items])
print(summed)
print('for loop {}s'.format(time.time() - start))

This code runs a time-consuming function serially (in a for loop) and produces this result:

sum_list fn: [0, 1, 2, 3, 4, 5, 6, 7, 8]
36
for loop 8.7735116481781s

multiprocessing

Could the multiprocessing style be viewed as a way to implement parallel computing?

I assume yes, since the doc says so.

Here is code_piece_2

import multiprocessing
start = time.time()
pool = multiprocessing.Pool(len(items))
num_to_sum = pool.map(counter, items)
print(sum_list(num_to_sum))
print('pool.map {}s'.format(time.time() - start))

This code runs the same time-consuming function in multiprocessing style and produces this result:

sum_list fn: [0, 1, 2, 3, 4, 5, 6, 7, 8]
36
pool.map 1.6011056900024414s

Obviously, the multiprocessing version is faster than the serial one in this particular case.

Dask is a flexible library for parallel computing in Python.

This code (code_piece_3) runs the same time-consuming function with Dask (I am not sure whether I am using Dask the right way).

from dask import delayed

@delayed
def counter(num=0):
    junk = 0
    for i in range(int(problem_size)):
        junk += 1
        junk -= 1
    return num
@delayed
def sum_list(args):
    print("sum_list fn:", args)
    return sum(args)

start = time.time()
summed = sum_list([counter(i) for i in items])
print(summed.compute())
print('dask delayed {}s'.format(time.time() - start))

I got this result:

sum_list fn: [0, 1, 2, 3, 4, 5, 6, 7, 8]
36
dask delayed 10.288054704666138s

My CPU has 6 physical cores.

Why does Dask perform so much slower while multiprocessing performs so much faster?

Am I using Dask the wrong way? If yes, what is the right way?

Note: Please discuss this particular case or other specific, concrete cases. Please do NOT talk in general terms.

Recommended answer

In your example, dask is slower than Python multiprocessing because you don't specify a scheduler, so dask uses the multithreading backend, which is the default for dask.delayed. As mdurant has pointed out, your code does not release the GIL, so the threads cannot execute the task graph in parallel.
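To see the GIL effect without dask in the picture, here is a minimal sketch that runs the same busy loop on a thread pool and on a process pool from the standard library. The helper name timed_map, the reduced problem_size and the worker count of 4 are my own choices for illustration; on a standard CPython build I would expect the thread pool to take about as long as the serial loop, but I am not quoting timings because they depend on your machine.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def counter(num=0, problem_size=2_000_000):
    # Same pure-Python busy loop as in the question (smaller size so it
    # finishes quickly). It holds the GIL for its whole run.
    junk = 0
    for _ in range(problem_size):
        junk += 1
        junk -= 1
    return num


def timed_map(executor_cls, items, workers=4):
    # Hypothetical helper: map counter over items on the given executor
    # and return (results, elapsed seconds).
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(counter, items))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    items = range(4)
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        results, seconds = timed_map(cls, items)
        print(cls.__name__, results, "{:.2f}s".format(seconds))
```

The threads all live in one interpreter and take turns holding the GIL, while each worker process has its own interpreter and its own GIL. That is the same difference you measured between code_piece_3 and code_piece_2.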

Have a look here for a good overview of the topic: https://docs.dask.org/en/stable/scheduler-overview.html

For your code, you could switch to the multiprocessing backend by calling: .compute(scheduler='processes').
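Applied to code_piece_3, that change could look roughly like the sketch below. The helpers build_graph and timed_compute are names I made up so both schedulers can be timed side by side, and PROBLEM_SIZE is smaller than in the question so it finishes quickly. The main guard is there because the worker processes may re-import the script when they start.

```python
import time

from dask import delayed

PROBLEM_SIZE = 10**6  # smaller than the 1e7 used in the question


def counter(num=0):
    # Pure-Python busy loop from the question; it never releases the GIL.
    junk = 0
    for _ in range(PROBLEM_SIZE):
        junk += 1
        junk -= 1
    return num


def build_graph(items):
    # Hypothetical helper: wrap the calls lazily. Nothing runs until
    # .compute() is called on the returned object.
    return delayed(sum)([delayed(counter)(i) for i in items])


def timed_compute(scheduler, items=range(9)):
    # Hypothetical helper: run the same graph on the named scheduler.
    start = time.time()
    total = build_graph(items).compute(scheduler=scheduler)
    return total, time.time() - start


if __name__ == "__main__":
    for scheduler in ("threads", "processes"):
        total, seconds = timed_compute(scheduler)
        print("{}: total={} in {:.2f}s".format(scheduler, total, seconds))
```

Only the scheduler argument differs between the two runs; the task graph itself is identical.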

If you use the multiprocessing backend, all communication between processes still needs to pass through the main process. You might therefore also want to check out the distributed scheduler, where worker processes can communicate with each other directly, which is especially beneficial for complex task graphs. The distributed scheduler also supports work stealing to balance work between processes, and it has a web interface that provides some diagnostic information about running tasks. It often makes sense to use the distributed scheduler rather than the multiprocessing scheduler even if you only want to compute on a local machine.
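A minimal local setup with the distributed scheduler might look like this. It assumes the distributed package is installed alongside dask; the worker count of 4 and the helper build_graph are my own choices, not something taken from your code.

```python
from dask import delayed
from dask.distributed import Client


def counter(num=0, problem_size=10**6):
    # Busy loop from the question, with a smaller default size.
    junk = 0
    for _ in range(problem_size):
        junk += 1
        junk -= 1
    return num


def build_graph(items, problem_size=10**6):
    # Hypothetical helper: build the lazy sum over all counter calls.
    return delayed(sum)([delayed(counter)(i, problem_size) for i in items])


if __name__ == "__main__":
    # Creating a Client with these arguments starts a local cluster of
    # single-threaded worker processes and makes it the default scheduler.
    with Client(n_workers=4, threads_per_worker=1) as client:
        print("dashboard:", client.dashboard_link)
        print(build_graph(range(9)).compute())
```

While the Client is open, a plain .compute() call is routed to it, so no scheduler argument is needed. The printed dashboard address is where the web interface is served.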
