Why does calling apply_async from multiprocessing.Pool throw "'module' object has no attribute XXX"?
Question
The code is below. When I copy and paste it into my cmd prompt, it throws 'module' object has no attribute 'func', but when I save it as a .py file and run python test.py, it works fine.
import multiprocessing
import time

def func(msg):
    for i in xrange(3):
        print msg
        time.sleep(1)

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=4)
    for i in xrange(5):
        msg = "hello %d" % (i)
        pool.apply_async(func, (msg, ))
    pool.close()
    pool.join()
    print "Sub-process(es) done."
Could anyone explain the difference between running Python code at the prompt and running it from a file? Thanks a lot!
Recommended answer
This happens because on Windows, func needs to be pickled and sent to the child process via IPC. A function is pickled by reference (its module name plus its own name), so for the child to unpickle func, it must be able to import it from the parent's __main__ module. In a normal Python script, the child can re-import your script, so __main__ contains every function declared at the script's top level, and it works fine. In the interactive interpreter, however, functions you defined at the prompt can't be re-imported from a file the way a script can, so they are missing from __main__ in the child. This is clearer if you use multiprocessing.Process directly to recreate the issue:
>>> def f():
...     print "HI"
...
>>> import multiprocessing
>>> p = multiprocessing.Process(target=f)
>>> p.start()
>>> Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\python27\lib\multiprocessing\forking.py", line 381, in main
    self = load(from_parent)
  File "C:\python27\lib\pickle.py", line 1378, in load
    return Unpickler(file).load()
  File "C:\python27\lib\pickle.py", line 858, in load
    dispatch[key](self)
  File "C:\python27\lib\pickle.py", line 1090, in load_global
    klass = self.find_class(module, name)
  File "C:\python27\lib\pickle.py", line 1126, in find_class
    klass = getattr(mod, name)
AttributeError: 'module' object has no attribute 'f'
This makes it clearer that pickle is the one failing to look up f in a module. If you add some tracing to pickle.py, you can see that 'module' refers to __main__:
def load_global(self):
    module = self.readline()[:-1]
    name = self.readline()[:-1]
    print("module {} name {}".format(module, name))  # I added this.
    klass = self.find_class(module, name)
    self.append(klass)
Rerunning the same code with that extra print statement yields this:
module multiprocessing.process name Process
module __main__ name f
< same traceback as before>
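The module and name pair in that trace output is essentially all that pickle stores for a function. As a minimal sketch (written for Python 3, unlike the Python 2 sessions above; json.dumps is only a stand-in for any importable function, and no_such_worker is a made-up name), the following shows a function round-tripping by reference and the lookup failing when the name is absent from __main__:

```python
import json
import pickle

# A function is pickled as a reference: its module name plus its own name.
# Protocol 0 is text-based, so both names are readable in the payload.
payload = pickle.dumps(json.dumps, protocol=0)

# Unpickling imports the module and looks the name up again,
# so we get back the very same function object.
restored = pickle.loads(payload)

# Hand-written protocol-0 payload: GLOBAL opcode 'c', a module line,
# a name line, then STOP '.'. The name "no_such_worker" is hypothetical
# and is not defined in __main__, so the lookup fails.
missing_reference = b"c__main__\nno_such_worker\n."

error = None
try:
    pickle.loads(missing_reference)
except AttributeError as exc:
    error = exc  # same class of failure as in the traceback above

print(payload)
print(type(error).__name__)
```

The two-line layout of the hand-written payload matches the two readline() calls in load_global: one for the module, one for the name.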
It's worth noting that this example actually works fine on POSIX platforms, because os.fork() is used to spawn the child processes, which means that any function defined before the Pool is created will be available in the child's __main__ module. So, while the above example works there, this one still fails, because the worker function is defined after the Pool is created (that is, after os.fork() is called):
>>> import multiprocessing
>>> p = multiprocessing.Pool(2)
>>> def f(a):
...     print(a)
...
>>> p.apply(f, "hi")
Process PoolWorker-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/multiprocessing/process.py", line 231, in _bootstrap
    self.run()
  File "/usr/lib64/python2.6/multiprocessing/process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib64/python2.6/multiprocessing/pool.py", line 57, in worker
    task = get()
  File "/usr/lib64/python2.6/multiprocessing/queues.py", line 339, in get
    return recv()
AttributeError: 'module' object has no attribute 'f'
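Both failures come down to the child being unable to find the function by module and name. One way to see this without multiprocessing is to unpickle in a brand-new interpreter, which roughly mimics what a child that does not inherit the parent's memory has to do. The sketch below is written for Python 3 and rests on assumptions: demo_workers is a hypothetical module created in a temporary directory just for the demonstration, and loads_in_fresh_interpreter is an illustrative helper, not part of multiprocessing.

```python
import importlib
import os
import pickle
import subprocess
import sys
import tempfile

# Child code: read a pickle from stdin and try to load it,
# roughly what a freshly started worker does with its task.
CHILD_CODE = "import pickle, sys; pickle.loads(sys.stdin.buffer.read())"


def loads_in_fresh_interpreter(payload, module_dir):
    """Return True if a brand-new Python process can unpickle payload."""
    env = dict(os.environ)
    env["PYTHONPATH"] = module_dir  # lets the child import from module_dir
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=payload,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode == 0


with tempfile.TemporaryDirectory() as module_dir:
    # demo_workers is a hypothetical module name used only for this sketch.
    with open(os.path.join(module_dir, "demo_workers.py"), "w") as handle:
        handle.write("def f(a):\n    return a\n")
    sys.path.insert(0, module_dir)
    importlib.invalidate_caches()
    demo_workers = importlib.import_module("demo_workers")

    # A function living in an importable module can be found by the child.
    importable_ok = loads_in_fresh_interpreter(
        pickle.dumps(demo_workers.f), module_dir
    )
    # Hand-written protocol-0 reference to __main__.f, which the child's
    # __main__ never defined (the interactive-interpreter situation).
    main_only_ok = loads_in_fresh_interpreter(b"c__main__\nf\n.", module_dir)

    sys.path.remove(module_dir)

print(importable_ok, main_only_ok)
```

In an interactive session the same idea suggests a workaround: save the worker function in a .py file, import it, and only then create the Pool, so the child can resolve the reference by importing that module instead of looking in __main__.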