Python multiprocessing--global variables in separate processes sharing id?

Question

From this question I learned that:

When you use multiprocessing to open a second process, an entirely new instance of Python, with its own global state, is created. That global state is not shared, so changes made by child processes to global variables will be invisible to the parent process.

To verify this behavior, I made a test script:

import time
import multiprocessing as mp
from multiprocessing import Pool

x = [0]  # global


def worker(c):
    if c == 1:  # wait for proc 2 to finish; is global x overwritten by now?
        time.sleep(2)
    print('enter: x =', x, 'with id', id(x), 'in proc', mp.current_process())
    x[0] = c
    print('exit: x =', x, 'with id', id(x), 'in proc', mp.current_process())
    return x[0]


if __name__ == '__main__':  # guard needed where children re-import this module (spawn)
    pool = Pool(processes=2)
    x_vals = pool.map(worker, [1, 2])
    print('parent: x =', x, 'with id', id(x), 'in proc', mp.current_process())
    print('final output', x_vals)

The output (on CPython) is something like

enter: x = [0] with id 140138406834504 in proc <ForkProcess(ForkPoolWorker-2, started daemon)>
exit: x = [2] with id 140138406834504 in proc <ForkProcess(ForkPoolWorker-2, started daemon)>
enter: x = [0] with id 140138406834504 in proc <ForkProcess(ForkPoolWorker-1, started daemon)>
exit: x = [1] with id 140138406834504 in proc <ForkProcess(ForkPoolWorker-1, started daemon)>
parent: x = [0] with id 140138406834504 in proc <_MainProcess(MainProcess, started)>
final output [1, 2]

How should I explain the fact that the id of x is shared in all the processes, yet x takes different values? Isn't id conceptually the memory address of a Python object? I guess this is possible if the memory space gets cloned in the child processes. Then is there something I can use to get the actual physical memory address of a Python object?

Answer

Shared State

When you use multiprocessing to open a second process, an entirely new instance of Python, with its own global state, is created. That global state is not shared, so changes made by child processes to global variables will be invisible to the parent process.

The crucial point here seems to be:

"That global state is not shared..."

...referring to the global state of the child process. That doesn't mean that part of the parent's global state can't be shared with the child process, as long as the child process doesn't attempt to write to that part. When it does write, that part gets copied and changed, and the change is not visible to the parent.

Background:

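On Unix, 'fork' has traditionally been the default way of starting the child process (macOS has defaulted to 'spawn' since Python 3.8, and Python 3.14 switched the other POSIX platforms to 'forkserver'). The multiprocessing documentation describes 'fork' as follows: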

The parent process uses os.fork() to fork the Python interpreter. The child process, when it begins, is effectively identical to the parent process. All resources of the parent are inherited by the child process. Note that safely forking a multithreaded process is problematic.

Available on Unix only. The default on Unix.
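Since the default start method depends on the platform and the Python version, it is worth checking what your interpreter actually uses before reasoning about shared state. A minimal sketch; describe_start_methods is a hypothetical helper name, while mp.get_start_method() and mp.get_all_start_methods() are the standard library calls:

```python
import multiprocessing as mp


def describe_start_methods():
    """Return (default_method, available_methods) for this interpreter."""
    # get_start_method() reports the method used when none is requested explicitly;
    # get_all_start_methods() lists every method supported on this platform.
    return mp.get_start_method(), mp.get_all_start_methods()


if __name__ == "__main__":
    default, available = describe_start_methods()
    print("default start method:", default)
    print("available start methods:", available)
```

If the default is not 'fork', the ids printed by the question's script will generally differ between processes, because each child builds its own copy of the module-level objects instead of inheriting the parent's address space.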

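Fork is implemented using copy-on-write, so right after the fork the child shares the parent's physical memory pages, including the one holding the list x. A page is copied only when one of the processes writes to it. In the test script, x[0] = c is such a write: the kernel gives that child a private copy of the affected page, so the change stays inside the child. (In CPython, merely referencing an object updates its reference count, which is also a write and can trigger the copy.)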
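The behaviour can be reproduced with a single child process. The sketch below assumes a Unix-like system where the 'fork' start method is available; child and fork_and_compare are hypothetical names chosen for illustration. The child mutates the global list and sends back what it sees, and the parent then compares that with its own view:

```python
import multiprocessing as mp

x = [0]  # module-level global, as in the question


def child(conn):
    """Mutate the global inside the child and report what the child sees."""
    x[0] = 1
    conn.send((id(x), list(x)))
    conn.close()


def fork_and_compare():
    """Fork one child; return (parent_id, child_id, child_x, parent_x)."""
    ctx = mp.get_context("fork")  # assumption: Unix only, 'fork' is available
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=child, args=(child_conn,))
    proc.start()
    child_id, child_x = parent_conn.recv()
    proc.join()
    # The parent's list is read only after the child has finished.
    return id(x), child_id, child_x, list(x)


if __name__ == "__main__":
    parent_id, child_id, child_x, parent_x = fork_and_compare()
    print("same id in both processes:", parent_id == child_id)
    print("child sees:", child_x, "parent sees:", parent_x)
```

Under 'fork' both ids are equal, the child reports [1], and the parent still holds [0]: the same virtual address, but no longer the same physical memory.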


Memory address

How should I explain the fact that the id of x is shared in all the processes, yet x takes different values?

Fork creates a child process in which the virtual address space is identical to the virtual address space of the parent. The virtual addresses will all map to the same physical addresses until copy-on-write occurs.

Modern OSes use virtual addressing. Basically the address values (pointers) you see inside your program are not actual physical memory locations, but pointers to an index table (virtual addresses) that in turn contains pointers to the actual physical memory locations. Because of this indirection, you can have the same virtual address point to different physical addresses IF the virtual addresses belong to index tables of separate processes.
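To make the indirection concrete, here is a toy model in plain Python. It is purely illustrative and not how any kernel is implemented: ToyProcess, its page_table dict and the frame numbers are all hypothetical, and for simplicity every write allocates a new frame, whereas a real kernel only copies a page that is still shared.

```python
class ToyProcess:
    """Hypothetical model: a page table mapping virtual pages to physical frames."""

    def __init__(self, memory, page_table):
        self.memory = memory          # shared dict: frame number -> contents
        self.page_table = page_table  # private dict: virtual page -> frame number

    def fork(self):
        # The child gets a copy of the page table, not of the frames themselves.
        return ToyProcess(self.memory, dict(self.page_table))

    def read(self, vpage):
        return self.memory[self.page_table[vpage]]

    def write(self, vpage, value):
        # Copy-on-write (simplified): put the new contents in a fresh frame
        # and remap only this process's virtual page to it.
        new_frame = max(self.memory) + 1
        self.memory[new_frame] = value
        self.page_table[vpage] = new_frame


VPAGE = 0x7F00  # made-up virtual page number standing in for id(x)
memory = {0: "x = [0]"}
parent = ToyProcess(memory, {VPAGE: 0})
child = parent.fork()
shared_before_write = parent.page_table[VPAGE] == child.page_table[VPAGE]
child.write(VPAGE, "x = [2]")
shared_after_write = parent.page_table[VPAGE] == child.page_table[VPAGE]
```

Both processes keep using the same virtual page number throughout, which corresponds to the identical id in the question, while the frame behind it differs once the child has written.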


Then is there something I can use to get the actual physical memory address of a Python object?

There doesn't seem to be a way to get the actual physical memory address from within Python. In CPython, id returns the object's virtual (logical) memory address. The actual translation from virtual to physical memory address falls to the MMU.
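That id is a (virtual) address in CPython can be checked by turning the integer back into an object reference with ctypes. This is a sketch that relies on a CPython implementation detail and is not guaranteed on other interpreters; the variable names are only illustrative:

```python
import ctypes

x = [0]
address = id(x)  # CPython detail: the virtual address of the list object

# Reinterpret the integer as a pointer to a Python object and dereference it.
recovered = ctypes.cast(address, ctypes.py_object).value

if __name__ == "__main__":
    print("virtual address:", hex(address))
    print("recovered object is x:", recovered is x)
```

The address is only meaningful inside the process that produced it; in another process the same number may refer to different physical memory, or to nothing at all.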
