python: problems with accessing variables while using multiprocessing

Problem description

I am new to multiprocessing concepts in Python and I have a problem accessing variables when I try to include multiprocessing in my code. Sorry if I sound naive, but I just can't figure it out. Below is a simplified version of my scenario.

class Data:
    def __init__(self):
        self.data = "data"
    def datameth(self):
        print self.data
        print mainvar

class First:
    def __init__(self):
        self.first = "first"
    def firstmeth(self):
        d = Data()
        d.datameth()
        print self.first

def mymethod():
    f = First()
    f.firstmeth()

if __name__ == '__main__':
    mainvar = "mainvar"
    mymethod()

When I run this, it runs fine and gives the output:

data
mainvar
first

But when I try to run mymethod() as a process:

from multiprocessing import Process
class Data:
    def __init__(self):
        self.data = "data"
    def datameth(self):
        print self.data
        print mainvar

class First:
    def __init__(self):
        self.first = "first"
    def firstmeth(self):
        d = Data()
        #print mainvar
        d.datameth()
        print self.first


def mymethod():
    f = First()
    f.firstmeth()

if __name__ == '__main__':
    mainvar = "mainvar"
    #mymethod()
    p = Process(target = mymethod)
    p.start()

I get an error like this:

NameError: global name 'mainvar' is not defined

The point is, I am not able to access mainvar from inside the First class or the Data class. What am I missing here?

Edit: Actually, in my real scenario, mainvar is not just declared; it is the return value of a method after some processing.

if __name__ == '__main__':
    ***some other stuff***
    mainvar = ***return value of some method***
    p = Process(target = mymethod)
    p.start()

Edit 2: As @dciriello mentioned in the comments, it works fine on Linux but not on Windows :(

Solution

This is a limitation of Windows, because it doesn't support fork. When a child process is forked on Linux, it gets a copy-on-write replica of the parent process's state, so the mainvar you defined inside if __name__ == "__main__": will be there. On Windows, however, the child process's state is created by re-importing the __main__ module of the program. In that re-import __name__ is not "__main__", so the body of the guard never runs, and mainvar, which is only created inside the guard, doesn't exist in the child. So, if you need to access mainvar inside a child process, you have to pass it to the child explicitly, as an argument to mymethod in the Process constructor. mymethod must then accept it as a parameter and hand it on to the classes that use it:

mainvar = "whatever"
p = Process(target=mymethod, args=(mainvar,))
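To make the change concrete, here is a minimal sketch of the question's code reworked so that mainvar is passed down explicitly instead of being read as a global. The extra constructor parameters and the returned lists are assumptions added for illustration (they are not in the original code), and single-argument print(...) calls are used so the sketch should run on both Python 2 and Python 3.

```python
from multiprocessing import Process


class Data:
    def __init__(self, mainvar):
        self.data = "data"
        self.mainvar = mainvar  # received explicitly, not read from a global

    def datameth(self):
        lines = [self.data, self.mainvar]
        for line in lines:
            print(line)
        return lines  # returned only so the behaviour is easy to check


class First:
    def __init__(self, mainvar):
        self.first = "first"
        self.mainvar = mainvar

    def firstmeth(self):
        # Hand the value on to Data instead of relying on a global name.
        lines = Data(self.mainvar).datameth()
        print(self.first)
        return lines + [self.first]


def mymethod(mainvar):
    # mainvar arrives through Process(args=...), so it exists in the child
    # regardless of how the child process was started.
    return First(mainvar).firstmeth()


if __name__ == '__main__':
    mainvar = "mainvar"  # stands in for the computed return value
    p = Process(target=mymethod, args=(mainvar,))
    p.start()
    p.join()
```

Anything passed through args has to be picklable on Windows, which is worth keeping in mind if the real mainvar is a more complex object than a string.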

This best practice is mentioned in the multiprocessing docs:

Explicitly pass resources to child processes

On Unix a child process can make use of a shared resource created in a parent process using a global resource. However, it is better to pass the object as an argument to the constructor for the child process.

Apart from making the code (potentially) compatible with Windows this also ensures that as long as the child process is still alive the object will not be garbage collected in the parent process.

Notice the part about Windows compatibility: though it's not quite spelled out, passing the object explicitly helps on Windows because it avoids the exact issue you're seeing.

This is also covered in the section of the docs that talks specifically about Windows limitations caused by the lack of fork:

Global variables

Bear in mind that if code run in a child process tries to access a global variable, then the value it sees (if any) may not be the same as the value in the parent process at the time that Process.start was called.

However, global variables which are just module level constants cause no problems.

Note the "if any". Because your global variable is declared inside the if __name__ == "__main__": guard, it doesn't even show up in the child.
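The difference between a module-level constant and a name created inside the guard can be observed directly. The following is a small sketch, assuming Python 3.4 or later, where multiprocessing.get_context("spawn") selects the same re-import based start method that Windows uses, so the behaviour can also be reproduced on Linux. The names MODULE_CONSTANT, guarded_var, visible_names and report are hypothetical and exist only for this demonstration.

```python
import multiprocessing as mp

MODULE_CONSTANT = "set at import time"  # module level: re-created on re-import


def visible_names():
    # Report which of the two names exist in this process's module globals.
    return {name: name in globals()
            for name in ("MODULE_CONSTANT", "guarded_var")}


def report(queue):
    # Runs in the child; sends its view of the globals back to the parent.
    queue.put(visible_names())


if __name__ == "__main__":
    guarded_var = "set only when run as a script"
    ctx = mp.get_context("spawn")  # start children the way Windows does
    queue = ctx.Queue()
    p = ctx.Process(target=report, args=(queue,))
    p.start()
    print("parent:", visible_names())
    print("child: ", queue.get())
    p.join()
```

Saved to a file and run as a script, the parent should report both names as present, while the child should report only MODULE_CONSTANT, since the guarded assignment never runs during the child's re-import.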
