Does multiprocessing in Python re-initialize globals?

Question

I have a multiprocessing program in which I'm unable to work with global variables. The program starts like this:

from multiprocessing import Process, Pool
print("Initializing")
someList = []
...
...
... 

This means someList is initialized before my main is called.

Later in the code, someList is set to some value, and then I create 4 processes to process it:

pool = Pool(4)
combinedResult = pool.map(processFn, someList)
pool.close()
pool.join()

Before spawning the processes, someList is set to a valid value.

However, when the processes are spawned, I see this printed 4 times: Initializing Initializing Initializing Initializing

Clearly, the initialization section at the top of the program is being run in each process. Also, someList is reset to empty. If my understanding is correct, each process should be a replica of the current process's state, which means I should have 4 copies of the same list. Why are the globals being re-initialized? And why is that section even being run?

Can someone please explain this to me? I consulted the Python docs but wasn't able to determine the root cause. They do recommend against using globals, and I'm aware of that, but it still doesn't explain why the initialization section is run again. Also, I'd like to use multiprocessing and not multithreading. I'm trying to understand how multiprocessing works here.

Thanks for your time.

Recommended answer

On Windows, processes are not forked as they are on Linux/Unix. Instead they are spawned: a new Python interpreter is started for each new multiprocessing.Process, and that interpreter imports the main module again. As a result, all module-level code (including the print call and the global assignments) runs again in every child, so the globals are re-initialized, and any changes the parent made to them along the way are not seen by the spawned processes.
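
To see the re-import for yourself, here is a minimal sketch that forces the "spawn" start method through multiprocessing.get_context, so it behaves the same on any platform. The names IMPORT_PID, report, and run_demo are made up for this illustration and do not come from the question. It assumes the code is saved as a script file and run directly.

```python
import multiprocessing as mp
import os

# Module-level code: under "spawn" this assignment runs again in every child,
# because each child re-imports this file.
IMPORT_PID = os.getpid()


def report(_):
    """Return (PID recorded at import time, PID of the process running this call)."""
    return IMPORT_PID, os.getpid()


def run_demo():
    """Run `report` in a 2-worker pool created with the "spawn" start method."""
    ctx = mp.get_context("spawn")
    with ctx.Pool(2) as pool:
        return pool.map(report, range(4))


if __name__ == "__main__":
    # Only the parent enters this block; spawned children skip it on re-import.
    print("parent pid:", os.getpid())
    for import_pid, worker_pid in run_demo():
        print("import-time pid:", import_pid, "worker pid:", worker_pid)
```

Under "spawn", each worker reports an import-time PID equal to its own PID, because the assignment at the top of the file ran again in that worker. With "fork" (where it is available), the workers would instead report the parent's PID, since they inherit the parent's memory rather than re-importing the module.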

One solution is to pass the globals to the Pool initializer and, from there, make them global in the spawned process as well:

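from multiprocessing import Pool

def init_pool(the_list):
    # Runs once in each worker process: store the list as a global there.
    global some_list
    some_list = the_list

def access_some_list(index):
    # Reads the global that init_pool set in this worker.
    return some_list[index]

if __name__ == "__main__":
    # Only the parent runs this block; spawned workers skip it on re-import.
    some_list = [24, 12, 6, 3]
    indexes = [3, 2, 1, 0]
    with Pool(initializer=init_pool, initargs=(some_list,)) as pool:
        result = pool.map(access_some_list, indexes)
    print(result)  # [3, 6, 12, 24]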

In this setup, the globals are copied to each new process and are accessible there. However, as always, any updates made from that point on are not propagated to any other process. For that you will need something like a multiprocessing.Manager.
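
As a rough illustration of the Manager approach, the sketch below shares one list between the parent and the pool workers through a manager proxy, so appends made in the workers are visible to the parent. The function names record_square and run_demo are hypothetical, and passing the proxy as part of each task argument is just one possible way to hand it to the workers.

```python
from multiprocessing import Manager, Pool


def record_square(args):
    """Append the square of `value` to a list shared through a Manager proxy."""
    shared_list, value = args
    shared_list.append(value * value)


def run_demo():
    """Let two workers append to one managed list and return its sorted contents."""
    with Manager() as manager:
        shared_list = manager.list()  # proxy to a list living in the manager process
        with Pool(2) as pool:
            pool.map(record_square, [(shared_list, v) for v in range(4)])
        # Copy the contents out before the manager shuts down.
        return sorted(shared_list)


if __name__ == "__main__":
    print(run_demo())
```

Each append is a round trip to the manager process, so this is convenient for small amounts of shared state but is slower than working on a local copy in each worker.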

As an extra comment, this shows that global variables can be dangerous, because it is hard to tell what values they will take in the different processes.
