Python multiprocessing pool, join; not waiting to go on?
Question
(1) I'm trying to use pool.map followed by pool.join(), but Python doesn't seem to be waiting for pool.map to finish before going on past pool.join(). Here's a simple example of what I've tried:
from multiprocessing import Pool

foo = {1: []}

def f(x):
    foo[1].append(x)
    print(foo)

def main():
    pool = Pool()
    pool.map(f, range(100))
    pool.close()
    pool.join()
    print(foo)

if __name__ == '__main__':
    main()
The printed output is just {1: []}, as if Python just ignored the join command and ran the final print before it had a chance to run f. The intended result is that foo is {1: [0, 1, ..., 99]}, and using the ordinary built-in Python map gives this result. Why is the pooled version printing {1: []}, and how can I change my code to make it print the intended result?
(2) Ideally I'd also like to define foo as a local variable in main() and pass it to f, but doing this by making foo the first argument of f and using
pool.map(functools.partial(f, foo), range(100))
produces the same output (and possibly also has the problem that each process now has its own copy of foo?). Though again, it works using the normal map instead.
Answer
This is not the correct way to use map.
- Using a global variable that way is absolutely wrong. Processes do not (normally) share the same memory, so every f will have its own copy of foo. To share a variable between different processes you should use a Manager.
- Functions passed to map are usually expected to return a value.
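To make the first point concrete, here is a minimal sketch of my own (the names append_plain and append_shared are illustrative, not from the answer) contrasting a plain global, which each worker process copies, with a Manager-backed list, which really is shared:

```python
from multiprocessing import Pool, Manager

plain = {1: []}  # ordinary dict: each worker gets its own copy

def append_plain(x):
    plain[1].append(x)  # mutates the worker's private copy only

def append_shared(shared_list, x):
    shared_list.append(x)  # proxy forwards the call to the manager process

if __name__ == '__main__':
    with Pool() as pool:
        pool.map(append_plain, range(10))
    print(plain)  # {1: []} in the parent: the workers' copies are gone

    with Manager() as manager:
        shared = manager.list()
        with Pool() as pool:
            pool.starmap(append_shared, [(shared, x) for x in range(10)])
        print(sorted(shared))  # [0, 1, ..., 9]: the updates are visible
```

The list proxy returned by manager.list() is picklable, so it can be sent to workers as an ordinary task argument; every append goes through the manager process, which is why the parent sees the results.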
I suggest you read the documentation. However, here is a dummy example of how you could implement it:
from multiprocessing import Pool

foo = {1: []}

def f(x):
    return x

def main():
    pool = Pool()
    foo[1] = pool.map(f, range(100))
    pool.close()
    pool.join()
    print(foo)

if __name__ == '__main__':
    main()
You may also do something like pool.map(functools.partial(f, foo), range(100)) where foo is a Manager.
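A hedged sketch of that last suggestion, assuming foo is a Manager().dict() proxy rather than a plain dict (the squaring in f is just a placeholder workload):

```python
import functools
from multiprocessing import Pool, Manager

def f(foo, x):
    # foo is a DictProxy: assigning through it updates the manager's copy,
    # so the change is visible to the parent process
    foo[x] = x * x

if __name__ == '__main__':
    with Manager() as manager:
        foo = manager.dict()
        with Pool() as pool:
            pool.map(functools.partial(f, foo), range(100))
        print(len(foo))  # 100: every worker wrote into the shared dict
```

One caveat: the proxy only forwards operations performed on the proxy itself, so foo[x] = value propagates, but mutating a nested object in place (e.g. foo[1].append(x)) would not.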