Speed up a single task using multiprocessing or threading

Question

Is it possible to speed up a single task using multiprocessing or threading? My gut feeling is that the answer is "no". Here is an example of what I mean by a "single task":

import random

max = 10000000  # number of iterations (note: this name shadows the built-in max)
for i in range(max):
    pick = random.choice(['on', 'off', 'both'])

With an argument of 10000000, it takes about 7.9 seconds to complete on my system.

I have a basic grasp of how to use multiprocessing and threading for multiple tasks. For example, if I have 10 directories, each containing X files that need to be read, I could create 10 threads.

I suspect that the single task uses only a single process (Task Manager reports minimal CPU usage). Is there a way to leverage my other cores in such cases? Or is increasing the CPU/memory speed the only way to get faster results?

Recommended answer

Here is a benchmark of your code with and without multiprocessing:

#!/usr/bin/env python

import random
import time

def test1():
    print "for loop with no multiproc: "
    m = 10000000
    t = time.time()
    for i in range(m):
        pick = random.choice(['on', 'off', 'both'])
    print time.time()-t

def test2():
    print "map with no multiproc: "
    m = 10000000
    t = time.time()
    map(lambda x: random.choice(['on', 'off', 'both']), range(m))
    print time.time()-t

def rdc(x):
    return random.choice(['on', 'off', 'both'])

def test3():
    from multiprocessing import Pool

    pool = Pool(processes=4)
    m = 10000000

    print "map with multiproc: "
    t = time.time()

    r = pool.map(rdc, range(m))
    print time.time()-t

if __name__ == "__main__":
    test1()
    test2()
    test3()

And here is the result on my workstation (a quad-core machine):

for loop with no multiproc: 
8.31032013893
map with no multiproc: 
9.48167610168
map with multiproc: 
4.94983720779
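
The multiprocess run above is roughly 1.7 times faster than the plain loop on four cores, not 4 times. One plausible contributor is that pool.map has to ship ten million inputs and results between processes. The following is a minimal Python 3 sketch, not part of the original benchmark, that hands each worker one large chunk and returns only a small tally. It assumes that only the aggregate counts are needed; the names split_work, count_picks and parallel_counts are hypothetical helpers introduced for illustration.

```python
import random
import time
from collections import Counter
from multiprocessing import Pool

CHOICES = ['on', 'off', 'both']


def split_work(total, parts):
    """Split `total` iterations into `parts` near-equal chunk sizes."""
    base, extra = divmod(total, parts)
    return [base + (1 if i < extra else 0) for i in range(parts)]


def count_picks(n):
    """Run the question's loop n times in one worker and return only the tallies."""
    rng = random.Random()  # per-call generator seeded from the OS, so workers do not share state
    counts = Counter()
    for _ in range(n):
        counts[rng.choice(CHOICES)] += 1
    return counts


def parallel_counts(total, workers=4):
    """Run count_picks in `workers` processes and merge the per-worker tallies."""
    with Pool(processes=workers) as pool:
        partials = pool.map(count_picks, split_work(total, workers))
    return sum(partials, Counter())


if __name__ == "__main__":
    start = time.time()
    totals = parallel_counts(1000000)  # assumed size; raise it to match the benchmark
    print(totals, time.time() - start)
```

Whether this beats the per-item pool.map on a given machine has to be measured; the point of the sketch is only that each process receives one integer and sends back one small Counter.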

Is it possible to speed up a single task using multi-processing/threading? My gut feeling is that the answer is 'no'.

Well, as far as I can tell, the answer is "damn, yes".

Is there a way to leverage my other cores in such cases? Or is increasing the CPU/Memory speeds the only way to get faster results?

Yes, by using multiprocessing. Because of the GIL, threads in a standard CPython build cannot execute Python bytecode on several cores at the same time, so threading does not help a CPU-bound loop like this one. Separate processes each have their own interpreter and their own GIL, so the operating system's scheduler can run them on different cores. That is how you get a real improvement on your tasks.
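
To see the difference between threads and processes on your own machine, here is a small Python 3 sketch using the standard concurrent.futures module. It is an illustration under assumptions rather than the benchmark from this answer: busy_loop and timed_run are hypothetical names, and the chunk sizes are arbitrary.

```python
import random
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def busy_loop(n):
    """CPU-bound stand-in for the question's loop; returns the number of picks made."""
    done = 0
    for _ in range(n):
        random.choice(['on', 'off', 'both'])
        done += 1
    return done


def timed_run(executor_cls, chunks, workers=4):
    """Run busy_loop over `chunks` with the given executor; return (results, seconds)."""
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as executor:
        results = list(executor.map(busy_loop, chunks))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    chunks = [250000] * 4  # assumed workload: four equal chunks
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        results, seconds = timed_run(cls, chunks)
        print(cls.__name__, sum(results), round(seconds, 2))
```

On a standard CPython build with the GIL, the thread pool would be expected to take about as long as a single-threaded loop, while the process pool can spread the chunks over several cores. The actual timings depend on the machine and the Python version.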
