Why doesn't `print` work in Python multiprocessing pool.map?
Question
I am trying to use the `multiprocessing` module to work with a large CSV file. I am using Python 2.7 and following the example from here.
I ran the unmodified code (copied below for convenience) and noticed that the `print` statements within the `worker` function do not work. Not being able to print makes it difficult to understand the flow and to debug.
Can anyone please explain why `print` is not working here? Does `pool.map` not execute print commands? I searched online but did not find any documentation that would indicate this.
import multiprocessing as mp
import itertools
import time
import csv


def worker(chunk):
    # `chunk` will be a list of CSV rows all with the same name column
    # replace this with your real computation
    print(chunk)     # <----- nothing prints
    print 'working'  # <----- nothing prints
    return len(chunk)


def keyfunc(row):
    # `row` is one row of the CSV file.
    # replace this with the name column.
    return row[0]


def main():
    pool = mp.Pool()
    largefile = 'test.dat'
    num_chunks = 10
    results = []
    with open(largefile) as f:
        reader = csv.reader(f)
        chunks = itertools.groupby(reader, keyfunc)
        while True:
            # make a list of num_chunks chunks
            groups = [list(chunk) for key, chunk in
                      itertools.islice(chunks, num_chunks)]
            if groups:
                result = pool.map(worker, groups)
                results.extend(result)
            else:
                break
    pool.close()
    pool.join()
    print(results)


if __name__ == '__main__':
    main()
Recommended answer
This is an issue with IDLE, which you're using to run your code. IDLE does a fairly basic emulation of a terminal to handle the output of the program you run in it. It cannot handle output from subprocesses, though, so while the workers run just fine in the background, you never see what they print.
The simplest fix is to run your code from the command line.
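To check this diagnosis, a small script along the following lines can be saved and run from a terminal. This is only a sketch: the filename `check_print.py`, the doubling `worker`, and the pool size of 2 are illustrative assumptions, not part of the original code. It is written to run on both Python 2.7 and Python 3.

```python
import multiprocessing as mp
import os
import sys


def worker(n):
    # Runs in a child process; the pid shows which process wrote the line.
    print('worker pid=%d got %r' % (os.getpid(), n))
    sys.stdout.flush()  # push the line out before the worker moves on
    return n * 2


if __name__ == '__main__':
    pool = mp.Pool(2)  # pool size chosen arbitrarily for the demonstration
    print('parent pid=%d' % os.getpid())
    print(pool.map(worker, [1, 2, 3]))
    pool.close()
    pool.join()
```

Run it with `python check_print.py` from a command prompt. The `worker pid=...` lines should show up there, normally with pids that differ from the parent's, whereas the same script launched from IDLE would be expected to show only the parent's lines.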
An alternative might be to use a more sophisticated IDE. There are a number of them listed on the Python wiki, though I'm not sure which ones have better terminal emulation for multiprocessing output.
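If you have to stay in IDLE, one possible workaround (my own suggestion, not something the example you are following does) is to have each worker return its debug text and let the parent process print it, since the parent's output does reach the IDLE shell. A minimal sketch, assuming made-up sample rows in place of the real CSV chunks and an arbitrary pool size of 2:

```python
import multiprocessing as mp
import os


def worker(chunk):
    # Build the debug message instead of printing it in the child process.
    message = 'pid %d handled %d rows' % (os.getpid(), len(chunk))
    return len(chunk), message


def main():
    # Hypothetical sample data standing in for the grouped CSV rows.
    groups = [[['a', '1'], ['a', '2']], [['b', '3']]]
    pool = mp.Pool(2)
    try:
        results = pool.map(worker, groups)
    finally:
        pool.close()
        pool.join()
    for count, message in results:
        # This print runs in the parent process, not in a worker.
        print(message)
    return [count for count, _ in results]


if __name__ == '__main__':
    print(main())
```

The trade-off is that messages only appear after `pool.map` returns, so this helps with inspecting what each worker did rather than with watching progress live.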