Python Multiprocessing with a single function


Problem Description



    I have a simulation that is currently running, but the ETA is about 40 hours -- I'm trying to speed it up with multi-processing.

    It essentially iterates over 3 values of one variable (L) and over 99 values of a second variable (a). Using these values, it runs a complex simulation and returns 9 different standard deviations. Thus (even though I haven't coded it that way yet) it is essentially a function that takes two values as inputs (L, a) and returns 9 values.

    Here is the essence of the code I have:

    STD_1 = []
    STD_2 = []
    # etc.
    
    for L in range(0,6,2):
        for a in range(1,100):
            ### simulation code ###
            STD_1.append(value_1)
            STD_2.append(value_2)
            # etc.
    

    Here is what I can modify it to:

    master_list = []
    
    def simulate(a,L):
        ### simulation code ###
        return (a,L,STD_1, STD_2 etc.)
    
    for L in range(0,6,2):
        for a in range(1,100): 
            master_list.append(simulate(a,L))
    

    Since each of the simulations is independent, this seems like an ideal place to implement some sort of multi-threading/processing.

    How exactly would I go about coding this?

    EDIT: Also, will everything be returned to the master list in order, or could it possibly be out of order if multiple processes are working?

    EDIT 2: This is my code -- but it doesn't run correctly. It asks if I want to kill the program right after I run it.

    import multiprocessing
    
    data = []
    
    for L in range(0,6,2):
        for a in range(1,100):
            data.append((L,a))
    
    print (data)
    
    def simulation(arg):
        # unpack the tuple
        a = arg[1]
        L = arg[0]
        STD_1 = a**2
        STD_2 = a**3
        STD_3 = a**4
        # simulation code #
        return((STD_1,STD_2,STD_3))
    
    print("1")
    
    p = multiprocessing.Pool()
    
    print ("2")
    
    results = p.map(simulation, data)
    

    EDIT 3: Also, what are the limitations of multiprocessing? I've heard that it doesn't work on OS X. Is this correct?

    Solution

    • Wrap the data for each iteration up into a tuple.
    • Make a list data of those tuples
    • Write a function f to process one tuple and return one result
    • Create a p = multiprocessing.Pool() object.
    • Call results = p.map(f, data)

    This will run as many instances of f as your machine has cores in separate processes.
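
    By default Pool() creates one worker process per CPU; if you want to cap that (for example, to leave a core free), it also accepts an explicit count. A minimal illustration of that optional parameter:

    p = multiprocessing.Pool(processes=4)  # use at most four worker processes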

    Edit1: Example:

    from multiprocessing import Pool
    
    data = [('bla', 1, 3, 7), ('spam', 12, 4, 8), ('eggs', 17, 1, 3)]
    
    def f(t):
        name, a, b, c = t
        return (name, a + b + c)
    
    p = Pool()
    results = p.map(f, data)
    print(results)
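
    Regarding the EDIT about ordering: Pool.map returns its results in the same order as the input iterable, regardless of which worker finishes first. A quick sanity check, assuming the f and data defined above:

    expected = [(name, a + b + c) for name, a, b, c in data]
    assert results == expected  # map() preserves the order of `data`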
    

    Edit2:

    Multiprocessing should work fine on UNIX-like platforms such as OSX. Only platforms that lack os.fork (mainly MS Windows) need special attention. But even there it still works. See the multiprocessing documentation.
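
    As for the EDIT 2 behaviour (the program asking to be killed right after it starts), one common cause is building the Pool at module level in an environment that re-imports the main script, such as an interactive shell or a spawn-based platform. Below is a minimal sketch, assuming Python 3, of the question's code restructured with the if __name__ == '__main__': guard that the multiprocessing documentation recommends; the simulation body is still the placeholder arithmetic from the question:

    import multiprocessing

    def simulation(arg):
        # unpack the (L, a) tuple
        L, a = arg
        # placeholder simulation code, as in the question
        STD_1 = a ** 2
        STD_2 = a ** 3
        STD_3 = a ** 4
        return (STD_1, STD_2, STD_3)

    if __name__ == '__main__':
        # build the list of (L, a) work items
        data = [(L, a) for L in range(0, 6, 2) for a in range(1, 100)]

        p = multiprocessing.Pool()
        # results come back in the same order as `data`
        results = p.map(simulation, data)
        p.close()
        p.join()

        print(results[:5])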
