What is being pickled when I call multiprocessing.Process?


Problem description


I know that multiprocessing uses pickling in order to have the processes run on different CPUs, but I think I am a little confused as to what is being pickled. Let's look at this code.

from multiprocessing import Process

def f(I):
    print('hello world!',I)

if __name__ == '__main__':
    for I in range(1, 3):
        Process(target=f,args=(I,)).start()

I assume what is being pickled is the def f(I) and the argument going in. First, is this assumption correct?

Second, let's say f(I) has a function call within it, like:

def f(I):
    print('hello world!',I)
    randomfunction()

Does the randomfunction's definition get pickled as well, or is it only the function call?

Furthermore, if that function call were located in another file, would the process be able to call it?

Solution

In this particular example, what gets pickled is platform dependent. On systems that support os.fork, like Linux, nothing is pickled here. Both the target function and the args you're passing get inherited by the child process via fork.

On platforms that don't support fork, like Windows, the f function and args tuple will both be pickled and sent to the child process. The child process will re-import your __main__ module, and then unpickle the function and its arguments.

In either case, randomfunction is not actually pickled. When you pickle f, all you're really pickling is a reference the child process can use to re-build the f function object. This is usually little more than a string that tells the child how to re-import f:

>>> import pickle
>>> def f(I):
...     print('hello world!',I)
...     randomfunction()
... 
>>> pickle.dumps(f)
'c__main__\nf\np0\n.'

The child process will just re-import f, and then call it. randomfunction will be accessible as long as it was properly imported into the original script to begin with.
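To make the name-only pickling concrete, here is a small self-contained sketch (plain pickle, no processes involved) showing that the payload for a module-level function is essentially just its module and name, and that unpickling resolves that name back to the original object:

```python
import pickle

def f(i):
    # Stand-in for the question's function; the body is irrelevant
    # to what gets pickled.
    return 'hello world! %s' % i

# Protocol 0 keeps the payload human-readable: it stores the module
# name and function name, not the function's bytecode.
payload = pickle.dumps(f, protocol=0)
print(payload)

# Unpickling just looks the name up again in the (re-)imported module.
g = pickle.loads(payload)
print(g is f)  # True: resolved by name, not reconstructed from code
```

This is why the child only needs to be able to re-import f; everything f calls internally, like randomfunction, is resolved at call time in the child, not at pickling time.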

Note that in Python 3.4+, you can get the Windows-style behavior on Linux by using contexts:

import multiprocessing

ctx = multiprocessing.get_context('spawn')
ctx.Process(target=f, args=(I,)).start()  # even on Linux, this will use pickle
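A practical consequence of the spawn behavior: the target has to be picklable. Module-level functions are (only their name is stored), but a lambda has no importable name, so pickling it fails. The sketch below shows this with plain pickle rather than by actually starting a process:

```python
import pickle

def top_level(i):
    return i * 2

# Fine: a module-level function pickles as just "module + name".
pickle.dumps(top_level)

# Not fine: a lambda can't be found by name in its module, so it
# can't be pickled -- and therefore can't be a target= under 'spawn'.
try:
    pickle.dumps(lambda i: i * 2)
    lambda_picklable = True
except (pickle.PicklingError, AttributeError):
    lambda_picklable = False
print(lambda_picklable)  # False
```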

The descriptions of the contexts are also probably relevant here, since they apply to Python 2.x as well:

spawn

The parent process starts a fresh Python interpreter process. The child process will only inherit those resources necessary to run the process object's run() method. In particular, unnecessary file descriptors and handles from the parent process will not be inherited. Starting a process using this method is rather slow compared to using fork or forkserver.

Available on Unix and Windows. The default on Windows.

fork

The parent process uses os.fork() to fork the Python interpreter. The child process, when it begins, is effectively identical to the parent process. All resources of the parent are inherited by the child process. Note that safely forking a multithreaded process is problematic.

Available on Unix only. The default on Unix.

forkserver

When the program starts and selects the forkserver start method, a server process is started. From then on, whenever a new process is needed, the parent process connects to the server and requests that it fork a new process. The fork server process is single threaded so it is safe for it to use os.fork(). No unnecessary resources are inherited.

Available on Unix platforms which support passing file descriptors over Unix pipes.

Note that forkserver is only available in Python 3.4+; there's no way to get that behavior on 2.x, regardless of the platform you're on.
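You can check at runtime which start methods your interpreter supports with multiprocessing.get_all_start_methods (available since 3.4); on Linux this typically reports all three, while on Windows only 'spawn' is listed:

```python
import multiprocessing

# Reports the start methods this platform/interpreter supports.
methods = multiprocessing.get_all_start_methods()
print(methods)  # e.g. ['fork', 'spawn', 'forkserver'] on Linux
```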
