Why is multiprocessing.Pool.map slower than builtin map?


Question

import multiprocessing
import time
from subprocess import call,STDOUT
from glob import glob
import sys


def do_calculation(data):
    x = time.time()
    with open(data + '.classes.report','w') as f:
        call(["external script", data], stdout = f.fileno(), stderr=STDOUT)
    return 'apk: {data!s} time {tim!s}'.format(data = data ,tim = time.time()-x)


def start_process():
    print 'Starting', multiprocessing.current_process().name

if __name__ == '__main__':

    inputs = glob('./*.dex')


    builtin_outputs = map(do_calculation, inputs)
    print 'Built-in:'
    for i in builtin_outputs:
        print i

    pool_size = multiprocessing.cpu_count() * 2
    print 'Worker Pool size: %s' % pool_size
    pool = multiprocessing.Pool(processes=pool_size,
                                initializer=start_process,
                                )
    pool_outputs = pool.map(do_calculation, inputs)
    pool.close() # no more tasks
    pool.join()  # wrap up current tasks

    print 'Pool output:'
    for i in pool_outputs:
        print i

Surprisingly, the individual calls under the built-in map (builtin_outputs) run faster than the same calls under pool.map (pool_outputs):

Built-in:
apk: ./TooDo_2.0.8.classes.dex time 5.69289898872
apk: ./TooDo_2.0.9.classes.dex time 5.37206411362
apk: ./Twitter_Client.classes.dex time 0.272782087326
apk: ./zaTelnet_Light.classes.dex time 0.141801118851
apk: ./Temperature_Converter.classes.dex time 0.270312070847
apk: ./Tipper_1.0.classes.dex time 0.293262958527
apk: ./XLive.classes.dex time 0.361288070679
apk: ./TwitterDroid_0.1.2_alpha.classes.dex time 0.381947040558
apk: ./Universal_Conversion_Application.classes.dex time 0.404763936996

Worker Pool size: 8

Pool output:
apk: ./TooDo_2.0.8.classes.dex time 5.72440505028
apk: ./TooDo_2.0.9.classes.dex time 5.9017829895
apk: ./Twitter_Client.classes.dex time 0.309305906296
apk: ./zaTelnet_Light.classes.dex time 0.374011039734
apk: ./Temperature_Converter.classes.dex time 0.450366973877
apk: ./Tipper_1.0.classes.dex time 0.379780054092
apk: ./XLive.classes.dex time 0.394504070282
apk: ./TwitterDroid_0.1.2_alpha.classes.dex time 0.505702018738
apk: ./Universal_Conversion_Application.classes.dex time 0.512043952942

How can this performance difference be explained?

Recommended answer

If the workload in "external script" is IO-heavy enough to saturate your hard disk, running multiple copies in parallel will only slow you down, because reading from several files at once incurs additional seeks.

The same applies if you are saturating your CPU and do not have multiple CPU cores available.
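One way to check whether contention is really costing you anything is to time each whole run rather than each call, since the figures in the question are per-call durations measured inside do_calculation. The following is a minimal sketch in Python 3 syntax, not the asker's code: simulated_task, timed, the sleep durations and the pool size of 4 are all hypothetical stand-ins. A sleep does not compete for disk or CPU, so replace simulated_task with the real workload to see actual contention.

```python
import multiprocessing
import time


def simulated_task(seconds):
    # Hypothetical stand-in for the external script: it only sleeps,
    # so it does not reproduce disk or CPU contention.
    time.sleep(seconds)
    return seconds


def timed(map_fn, func, inputs):
    """Run map_fn(func, inputs); return (wall-clock seconds, list of results)."""
    start = time.time()
    results = list(map_fn(func, inputs))
    return time.time() - start, results


if __name__ == '__main__':
    # Assumed durations, loosely shaped like the question's mix of
    # a couple of long jobs and several short ones.
    durations = [0.5, 0.5, 0.1, 0.1, 0.1, 0.1]

    serial_elapsed, _ = timed(map, simulated_task, durations)
    print('Built-in map total: %.2f s' % serial_elapsed)

    # Pool start-up happens before the timer starts, so only the
    # mapping itself is measured.
    pool = multiprocessing.Pool(processes=4)
    pool_elapsed, _ = timed(pool.map, simulated_task, durations)
    pool.close()
    pool.join()
    print('Pool.map total: %.2f s' % pool_elapsed)
```

If the pool's total is not lower than the serial total with the real workload, that would be consistent with the disk or CPU already being saturated; trying a smaller pool size is then a reasonable next experiment.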
