python: problems with accessing variables while using multiprocessing
Problem description
I am new to multiprocessing in Python, and I have a problem accessing variables when I try to use multiprocessing in my code. Sorry if I sound naive, but I just can't figure it out. Below is a simplified version of my scenario.

    class Data:
        def __init__(self):
            self.data = "data"

        def datameth(self):
            print self.data
            print mainvar

    class First:
        def __init__(self):
            self.first = "first"

        def firstmeth(self):
            d = Data()
            d.datameth()
            print self.first

    def mymethod():
        f = First()
        f.firstmeth()

    if __name__ == '__main__':
        mainvar = "mainvar"
        mymethod()

When I run this, it runs fine and gives the output:

    data
    mainvar
    first

But when I try to run mymethod() as a process:

    from multiprocessing import Process

    class Data:
        def __init__(self):
            self.data = "data"

        def datameth(self):
            print self.data
            print mainvar

    class First:
        def __init__(self):
            self.first = "first"

        def firstmeth(self):
            d = Data()
            #print mainvar
            d.datameth()
            print self.first

    def mymethod():
        f = First()
        f.firstmeth()

    if __name__ == '__main__':
        mainvar = "mainvar"
        #mymethod()
        p = Process(target = mymethod)
        p.start()

I get an error like this:
NameError: global name 'mainvar' is not defined
The point is, I am not able to access mainvar from inside the First class or the Data class. What am I missing here?

Edit: Actually, in my real scenario, mainvar is not just declared; it is the return value of a method after some processing.

    if __name__ == '__main__':
        ***some other stuff***
        mainvar = ***return value of some method**
        p = Process(target = mymethod)
        p.start()

Edit 2: As @dciriello mentioned in the comments, it works fine on Linux but not on Windows :(
Solution

This is a limitation of Windows, which doesn't support fork. When a child process is forked on Linux, it gets a copy-on-write replica of the parent process's state, so the mainvar you defined inside if __name__ == "__main__": will be there. On Windows, however, the child process's state is created by re-importing the __main__ module of the program. This means that mainvar doesn't exist in the children, because it's only created inside the if __name__ == "__main__": guard. So, if you need to access mainvar inside a child process, your only option is to explicitly pass it to the child as an argument to mymethod in the Process constructor:

    mainvar = "whatever"
    p = Process(target=mymethod, args=(mainvar,))

mymethod then has to accept mainvar as a parameter and hand it on to whatever code needs it.
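As a rough sketch of how the question's code could look with that change applied: the version below threads mainvar through as a parameter instead of reading a global. It assumes Python 3 (so print is a function, unlike in the question), and the extra mainvar parameters on datameth, firstmeth and mymethod, as well as the final p.join(), are illustrative additions rather than part of the original code.

```python
from multiprocessing import Process


class Data:
    def __init__(self):
        self.data = "data"

    def datameth(self, mainvar):
        # mainvar arrives as a parameter instead of being looked up as a global
        print(self.data)
        print(mainvar)


class First:
    def __init__(self):
        self.first = "first"

    def firstmeth(self, mainvar):
        d = Data()
        d.datameth(mainvar)
        print(self.first)


def mymethod(mainvar):
    f = First()
    f.firstmeth(mainvar)


if __name__ == "__main__":
    # Stands in for the computed return value mentioned in the question's edit
    mainvar = "mainvar"
    # The value is pickled and sent to the child, so the child does not
    # depend on anything defined under this guard
    p = Process(target=mymethod, args=(mainvar,))
    p.start()
    p.join()  # wait for the child to finish (added for the sketch)
```

Because the value is sent to the child by pickling, it has to be picklable; plain strings, numbers, lists and dicts are fine.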
This best practice is mentioned in the multiprocessing docs:

Explicitly pass resources to child processes

On Unix a child process can make use of a shared resource created in a parent process using a global resource. However, it is better to pass the object as an argument to the constructor for the child process.

Apart from making the code (potentially) compatible with Windows this also ensures that as long as the child process is still alive the object will not be garbage collected in the parent process.

Notice the part about making the code "(potentially) compatible with Windows". Though it's not quite spelled out, passing the object explicitly helps with Windows compatibility because it avoids the exact issue you're seeing.
This is also covered in the section of the docs that talks specifically about Windows limitations caused by the lack of fork:

Global variables

Bear in mind that if code run in a child process tries to access a global variable, then the value it sees (if any) may not be the same as the value in the parent process at the time that Process.start was called.

However, global variables which are just module level constants cause no problems.

Note the "if any". Because your global variable is declared inside the if __name__ == "__main__": guard, it doesn't even show up in the child.
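To see the difference between a module-level constant and a name created under the guard without needing a Windows machine, one option is to request the "spawn" start method explicitly, which starts a fresh interpreter for the child much as Windows does. This is a minimal sketch assuming Python 3.4 or later; MODULE_CONSTANT, guard_only and report are hypothetical names chosen for the illustration.

```python
import multiprocessing as mp

# Defined at import time, so a re-import of this module recreates it
MODULE_CONSTANT = "defined at import time"


def report(queue):
    # Runs in the child: record which names exist in this module's globals
    names = ("MODULE_CONSTANT", "guard_only")
    queue.put({name: name in globals() for name in names})


if __name__ == "__main__":
    # Only created when the file is run as the main script
    guard_only = "defined under the guard"

    # "spawn" starts a fresh interpreter and re-imports this file in the
    # child, which is the behaviour the answer describes for Windows
    ctx = mp.get_context("spawn")
    queue = ctx.Queue()
    p = ctx.Process(target=report, args=(queue,))
    p.start()
    print(queue.get())
    p.join()
```

Saved to a file and run as a script, the child should report MODULE_CONSTANT as present and guard_only as missing, since the guarded block is not executed when the child re-imports the file.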