Python: Yield in multiprocessing Pool
Question
I have to parallelize a function that involves a yield. This is only a simplified replica of the whole program I'm working on, but it sums up the problem I'm facing. I'm trying to understand multiprocessing, apply_async and yield for my project. In this example I've used multiprocessing.Pool with apply_async to parallelize the work. I've put some print statements in the parallel function, but they aren't printed. When I replace yield with return, the print statements do show up. I'm not certain about the nature of yield. I know it creates a generator, which can be consumed only once after it is returned. Please advise on how to get this working.
import multiprocessing as mp

results = []  # filled in the parent process by the callback

def parallel(x, y, z):
    print("aim in parallel")
    count = 0
    result = []
    for line in range(10000):
        count += 1
    result.append(count)
    p = x**3 + y + z
    print(" result")
    print(result)
    print(p)
    if p > 0:
        return result
        # Generator variant that triggers the problem:
        # yield result, p
        # count += 1
        # yield p, result
        # count += 1

def collect_results(result):
    # Called in the parent process with the return value of each task
    print("aim in callback")
    results.append(result)
    # print(results)

def apply_async_with_callback():
    pool = mp.Pool(processes=10)
    for i in range(10):
        pool.apply_async(parallel, args=(2, 5, 7), callback=collect_results)
    pool.close()
    pool.join()
    print("length")
    print(len(results))
    print(results)

if __name__ == "__main__":
    apply_async_with_callback()
Recommended answer
When a function containing a yield statement is called, it doesn't actually run the code but returns a generator instead:
>>> p = parallel(1, 2, 3)
>>> p
<generator object parallel at 0x7fde9c1daf00>
Then, when the next value is required, the code will run until a value is yielded:
>>> next(p)
([10000], 6)
>>> next(p)
(6, [10000])
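This also explains the missing print output. The following is a minimal sketch with a hypothetical generator named countdown (not part of the original code) that records when each part of its body executes, to show that side effects only happen once next() is called:

```python
events = []  # records which parts of the generator body have executed

def countdown():
    # Hypothetical generator, used only for illustration.
    events.append("body started")
    yield 2
    events.append("resumed after first yield")
    yield 1

gen = countdown()                 # creates the generator; the body does not run
events_after_call = list(events)

first = next(gen)                 # runs the body up to the first yield
events_after_first = list(events)

second = next(gen)                # resumes and runs up to the second yield

print(events_after_call)
print(first, events_after_first)
print(second, events)
```

Nothing is recorded until the first next() call, which is the same reason the print statements in parallel stay silent when nobody consumes the generator.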
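In your case, each of the 10 tasks only creates a generator inside a worker process, and its body is never actually run, which is why nothing is printed. Generator objects also cannot be pickled, so they cannot be sent back to the parent process, and the callback never receives them.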
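A pool sends return values back to the parent process by pickling them, so it matters whether a generator object can be pickled at all. The sketch below, which uses a hypothetical generator named squares, suggests that it cannot, while a list built from the same generator round-trips without trouble:

```python
import pickle

def squares(n):
    # Hypothetical generator, used only for illustration.
    for i in range(n):
        yield i * i

try:
    pickle.dumps(squares(3))      # a generator object itself
    picklable = True
except TypeError:
    picklable = False

# A list built from the generator is ordinary data and pickles fine.
as_list = pickle.loads(pickle.dumps(list(squares(3))))

print(picklable)
print(as_list)
```

This is why consuming the generator inside the worker and returning plain data is the usual way around the problem.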
If you want to use a generator, you could change your code a bit to target a function that creates a list from the generator:
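def parallel2(x, y, z):
    # Consume the generator inside the worker so its body actually runs
    return list(parallel(x, y, z))

def collect_results(lst):
    # Each task now returns a list of yielded values
    results.extend(lst)

def apply_async_with_callback():
    pool = mp.Pool()
    for _ in range(10):
        pool.apply_async(parallel2, args=(2, 5, 7),
                         callback=collect_results)
    # Wait for all tasks to finish before reading results
    pool.close()
    pool.join()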
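Putting the pieces together, here is one possible self-contained version, written as a sketch rather than a definitive implementation. It assumes that parallel is the generator variant from the question (with the yield lines enabled and the prints removed), and run_pool is a hypothetical helper name introduced only for this example:

```python
import multiprocessing as mp

def parallel(x, y, z):
    # Generator variant of the function from the question.
    count = 0
    for _ in range(10000):
        count += 1
    result = [count]
    p = x ** 3 + y + z
    if p > 0:
        yield result, p
        yield p, result

def parallel2(x, y, z):
    # Runs the generator to completion inside the worker and
    # returns a plain list, which can be pickled.
    return list(parallel(x, y, z))

def run_pool(n_tasks=10):
    # Hypothetical helper: submits n_tasks jobs and collects every
    # yielded value into one list in the parent process.
    results = []
    pool = mp.Pool()
    for _ in range(n_tasks):
        pool.apply_async(parallel2, args=(2, 5, 7),
                         callback=results.extend)
    pool.close()
    pool.join()  # wait until all tasks and callbacks have finished
    return results
```

Call run_pool() from under an `if __name__ == "__main__":` guard, as in the question. With the default of 10 tasks and two yields per task, the returned list should hold 20 items.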