Importing Modules that use MultiProcessing Python

Question

I am looking to use the multiprocessing module to speed up the run time of some Transport Planning models. I've optimized as much as I can via 'normal' methods, but at the heart of it is an embarrassingly parallel problem. E.g. perform the same set of matrix operations for 4 different sets of inputs, all independent of one another.

Pseudocode:

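    # a1..a4, b1..b4, c1..c4 and d1..d4 are independent input matrices
    for mat1, mat2, mat3, mat4 in zip([a1, a2, a3, a4], [b1, b2, b3, b4], [c1, c2, c3, c4], [d1, d2, d3, d4]):
        result1 = mat1 * mat2 ** mat3  # ** is the power operator; ^ is bitwise XOR in Python
        result2 = mat1 / mat4
        result3 = mat3.T * mat2.T + mat4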

So all I really want to do is process the iterations of this loop in parallel on a quad-core computer. I've read up on the multiprocessing module here and in other places, and it seems to fit the bill perfectly, except for the required:

    if __name__ == '__main__':

From what I understand, this means that you can only use multiprocessing in code that is run as a script? I.e. if I do something like:

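    import multiprocessing
    from numpy.random import randn

    a = randn(100, 100)
    b = randn(100, 100)
    c = randn(100, 100)
    d = randn(100, 100)

    def process_matrix(mat):
        # ** is the power operator; ^ is bitwise XOR and raises TypeError on float arrays
        return mat ** 2

    if __name__ == '__main__':
        print("Multiprocessing")
        jobs = []

        # Start one process per input matrix
        for input_matrix in [a, b, c, d]:
            p = multiprocessing.Process(target=process_matrix, args=(input_matrix,))
            jobs.append(p)
            p.start()

        # Wait for all processes to finish
        for p in jobs:
            p.join()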

It runs fine. However, assuming I saved the above as 'matrix_multiproc.py' and defined a new file 'importing_test.py' which just states:

    import matrix_multiproc

The multiprocessing does not happen, because __name__ is now 'matrix_multiproc' and not '__main__'.

Does this mean I can never use parallel processing on an imported module? All I am trying to do is have my model run as:

    def Model_Run():
        import Part1, Part2, Part3, matrix_multiproc, Part4

        Part1.Run()
        Part2.Run()
        Part3.Run()
        matrix_multiproc.Run()
        Part4.Run()

Sorry for a really long question to what is probably a simple answer, thanks!

Recommended answer


Does this mean I can never use parallel processing on an imported module?

No, it doesn't. You can use multiprocessing anywhere in your code, provided that the program's main module uses the if __name__ == '__main__' guard. Code inside an imported module, such as a Run() function, can create processes freely; the guard only has to protect the entry point of the script you actually launch.
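As a minimal sketch of that arrangement, the block below puts the parallel work inside an ordinary function that any module could import and call. The names process_matrix and run, the processes parameter, and the use of multiprocessing.Pool are illustrative assumptions, not part of the original question. Plain nested lists stand in for NumPy arrays so the sketch needs only the standard library.

```python
# Hypothetical contents of matrix_multiproc.py: an importable module
# whose run() function does its work in worker processes.
import multiprocessing


def process_matrix(mat):
    """Square every element of a matrix given as a list of rows."""
    return [[x ** 2 for x in row] for row in mat]


def run(matrices, processes=4):
    """Process each matrix in a worker process; results keep input order."""
    with multiprocessing.Pool(processes=processes) as pool:
        return pool.map(process_matrix, matrices)


# In a multi-file project this guard would live in the top-level script
# (for example the file that defines Model_Run), which would simply call
# matrix_multiproc.run(...). It is placed here to keep the sketch to one file.
if __name__ == '__main__':
    inputs = [[[1, 2], [3, 4]], [[0, 1], [1, 0]]]
    print(run(inputs, processes=2))
```

The worker function is defined at module level so that it can be pickled and sent to the worker processes; a function nested inside run() would not work with Pool.map.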

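On Unix systems, you won't even need that guard when multiprocessing uses the fork start method, since the fork() system call creates child processes directly from the main Python process. Fork is not the default everywhere, though (macOS has defaulted to spawn since Python 3.8), so keeping the guard is the portable choice.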

On Windows, on the other hand, there is no fork(), so multiprocessing emulates it by spawning a new process that imports the main module again under a different __name__. Without the guard, that import would re-run your main code and try to spawn new processes again, resulting in an endless loop that eats up all your computer's memory pretty fast (recent Python versions detect this and raise a RuntimeError instead).
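The re-import can be observed with a small experiment, sketched here under some assumptions: the helper role and the function worker are hypothetical names, and the "spawn" start method is requested explicitly so that the behaviour resembles Windows on any platform. The file has to be saved and run as a script for the child to be able to re-import it.

```python
import multiprocessing


def role(name):
    """Describe how this file was loaded, based on its __name__."""
    return "main script" if name == "__main__" else "re-imported by a child"


# Top-level code runs on every load of this file, including the re-import
# that the "spawn" start method performs inside the child process.
print("loaded as", __name__, "->", role(__name__))


def worker():
    print("worker ran in a separate process")


if __name__ == "__main__":
    # Everything below is skipped in the child, because its __name__ differs.
    ctx = multiprocessing.get_context("spawn")
    child = ctx.Process(target=worker)
    child.start()
    child.join()
```

When run as a script, the "loaded as" line should appear twice: once for the parent as __main__ and once for the child under a different name (CPython uses __mp_main__). Removing the guard would make the child attempt to start a process of its own during that re-import.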
