Using Python's multiprocessing.pool.map to manipulate the same integer


Problem

I'm using Python's multiprocessing module to execute functions asynchronously. What I want to do is be able to track the overall progress of my script as each process calls and executes def add_print. For instance, I would like the code below to add 1 to total and print out the value (1 2 3 ... 18 19 20) every time a process runs that function. My first attempt was to use a global variable but this didn't work. Since the function is being called asynchronously, each process reads total as 0 to start off, and adds 1 independently of other processes. So the output is 20 1's instead of incrementing values.

How could I go about referencing the same block of memory from my mapped function in a synchronous manner, even though the function is being run asynchronously? One idea I had was to somehow cache total in memory and then reference that exact block of memory when I add to total. Is this a possible and fundamentally sound approach in python?

Please let me know if you need any more info or if I didn't explain something well enough.

Thanks!


Code

#!/usr/bin/python

## Import builtins
from multiprocessing import Pool 

total = 0

def add_print(num):
    global total
    total += 1
    print(total)


if __name__ == "__main__":
    nums = range(20)

    pool = Pool(processes=20)
    pool.map(add_print, nums)

Solution

You could use a shared Value:

import multiprocessing as mp

def add_print(num):
    total.value += 1
    print(total.value)

def setup(t):
    global total
    total = t

if __name__ == "__main__":
    total = mp.Value('i', 0)       
    nums = range(20)
    pool = mp.Pool(initializer=setup, initargs=[total])
    pool.map(add_print, nums)

The pool initializer calls setup once for each worker subprocess. setup makes total a global variable in the worker process, so total can be accessed inside add_print when the worker calls add_print.

Note, the number of processes should not exceed the number of CPUs your machine has. If you do, the excess subprocesses will wait around for a CPU to become available. So don't use processes=20 unless you have 20 or more CPUs. If you don't supply a processes argument, multiprocessing will detect the number of CPUs available and spawn a pool with that many workers for you. The number of tasks (e.g. the length of nums) usually greatly exceeds the number of CPUs. That's fine; the tasks are queued and processed by the workers as each one becomes available.
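As a rough sketch of that queuing behaviour (the `square` function and the task count here are illustrative, not from the question): omitting the `processes` argument lets the pool size itself to the CPU count, and a task list much longer than the pool is simply worked through as workers free up.

```python
import multiprocessing as mp

def square(n):
    return n * n

if __name__ == "__main__":
    # No processes= argument: Pool defaults to os.cpu_count() workers.
    with mp.Pool() as pool:
        # 100 tasks shared among however many workers exist; each
        # worker pulls the next queued task as soon as it finishes one.
        results = pool.map(square, range(100))
    print(results[:5])  # [0, 1, 4, 9, 16]
```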
