Saving multiple matplotlib figures with multiprocessing
Question
I have code that reads data from multiple files named 001.txt, 002.txt, ... , 411.txt. I would like to read the data from each file, plot it, and save the plots as 001.jpg, 002.jpg, ... , 411.jpg.
I can do this by looping through the files, but I would like to use the multiprocessing module to speed things up.
However, when I use the code below, the computer hangs: I can't click on anything, although the mouse still moves and the sound continues. I then have to power down the computer.
I'm obviously misusing the multiprocessing module with matplotlib. I have used code very similar to the code below to generate the data and save it to text files without any problems. What am I missing?
    import multiprocessing

    def do_plot(number):
        fig = figure(number)

        a, b = random.sample(range(1,9999),1000), random.sample(range(1,9999),1000)
        # generate random data
        scatter(a, b)

        savefig("%03d" % (number,) + ".jpg")
        print "Done ", number
        close()

    for i in (0, 1, 2, 3):
        jobs = []
        # for j in chunk:
        p = multiprocessing.Process(target = do_plot, args = (i,))
        jobs.append(p)
        p.start()
        p.join()
Recommended answer
The most important thing when using multiprocessing is to run the main code of the module only in the main process. This can be achieved by testing if __name__ == '__main__', as shown below:
    import matplotlib.pyplot as plt
    import numpy.random as random
    from multiprocessing import Pool

    def do_plot(number):
        fig = plt.figure(number)

        a = random.sample(1000)
        b = random.sample(1000)

        # generate random data
        plt.scatter(a, b)

        plt.savefig("%03d.jpg" % (number,))
        plt.close()

        print("Done ", number)

    if __name__ == '__main__':
        pool = Pool()
        pool.map(do_plot, range(4))
Note also that I replaced the creation of separate processes with a process pool, which scales better to many pictures since it only uses as many processes as you have cores available.
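For the actual task in the question (001.txt to 411.txt), the same pattern might look roughly like the sketch below. It is only an illustration under assumptions that are not in the original post: each text file is assumed to hold two whitespace-separated numeric columns, and the helper names `output_name`, `read_columns`, and `plot_file` are made up for this example. Selecting the non-interactive Agg backend is an extra precaution added here, not something the answer above requires.

```python
import glob
import os
from multiprocessing import Pool


def output_name(txt_path):
    """Map an input such as '001.txt' to the image name '001.jpg'."""
    base, _ = os.path.splitext(txt_path)
    return base + ".jpg"


def read_columns(txt_path):
    """Read two whitespace-separated numeric columns (assumed file layout)."""
    xs, ys = [], []
    with open(txt_path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            xs.append(float(fields[0]))
            ys.append(float(fields[1]))
    return xs, ys


def plot_file(txt_path):
    """Worker: plot one file and save the image next to the input."""
    # Imported here so the backend is chosen inside each worker process.
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, no GUI window needed
    import matplotlib.pyplot as plt

    xs, ys = read_columns(txt_path)
    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    out = output_name(txt_path)
    fig.savefig(out)
    plt.close(fig)  # free the figure; workers are reused for many files
    return out


def main():
    # Only files named like 001.txt ... 411.txt in the current directory.
    paths = sorted(glob.glob("[0-9][0-9][0-9].txt"))
    if not paths:
        print("No input files found")
        return
    # Pool() defaults to the number of CPUs reported by the OS.
    with Pool() as pool:
        for out in pool.imap_unordered(plot_file, paths):
            print("Done", out)


if __name__ == "__main__":
    main()
```

Here `imap_unordered` is used instead of `map` so that each result is reported as soon as its worker finishes; if the order of the "Done" messages matters, `pool.map` works the same way as in the answer.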