matplotlib savefig performance, saving multiple pngs within loop

Question

I'm hoping to find a way to optimise the following situation. I have a large contour plot created with imshow of matplotlib. I then want to use this contour plot to create a large number of png images, where each image is a small section of the contour image by changing the x and y limits and the aspect ratio.

So no plot data is changing in the loop, only the axis limits and the aspect ratio are changing between each png image.

The following MWE creates 70 png images in a "figs" folder demonstrating the simplified idea. About 80% of the runtime is taken up by fig.savefig('figs/'+filename).

I've looked into the following without coming up with an improvement:

  • An alternative to matplotlib with a focus on speed -- I've struggled to find any examples/documentation of contour/surface plots with similar requirements
  • Multiprocessing -- Similar questions I've seen here appear to require fig = plt.figure() and ax.imshow to be called within the loop, since fig and ax can't be pickled. In my case this will be more expensive than any speed gains achieved by implementing multiprocessing.

I'd appreciate any insight or suggestions you might have.

import numpy as np
import matplotlib as mpl
mpl.use('agg')
import matplotlib.pyplot as plt
import time, os

def make_plot(x, y, fig, ax):
    aspect = np.random.random(1)+y/2.0-x
    xrand = np.random.random(2)*x
    xlim = [min(xrand), max(xrand)]
    yrand = np.random.random(2)*y
    ylim = [min(yrand), max(yrand)]
    filename = '{:d}_{:d}.png'.format(x,y)

    ax.set_aspect(abs(aspect[0]))
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)
    fig.savefig('figs/'+filename)

if not os.path.isdir('figs'):
    os.makedirs('figs')
data = np.random.rand(25, 25)

fig = plt.figure()
ax = fig.add_axes([0., 0., 1., 1.])
# in the real case, imshow is an expensive calculation which can't be put inside the loop
ax.imshow(data, interpolation='nearest')

tstart = time.perf_counter()  # time.clock() was removed in Python 3.8
for i in range(1, 8):
    for j in range(3, 13):
        make_plot(i, j, fig, ax)

print('took {:.2f} seconds'.format(time.perf_counter()-tstart))

Recommended answer

Since the bottleneck in this case is the call to plt.savefig(), there is not much room to optimize it. Internally the figure is rendered from scratch on every call, and that takes a while. Reducing the number of vertices to be drawn might reduce the time a bit.
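To check how much of each iteration really goes into rendering, one can time the savefig call separately from the limit changes. The following is only a rough sketch, not part of the original code: the helper name time_loop, the frame count, and the choice to save into an in-memory buffer (to leave disk I/O out of the measurement) are assumptions made for illustration, and the Agg backend is selected only so that it runs without a display.

```python
import io
import time

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def time_loop(n_frames=10, seed=0):
    """Return (seconds spent inside savefig, total seconds) for n_frames saves."""
    rng = np.random.default_rng(seed)
    data = rng.random((25, 25))  # stand-in for the expensive image
    fig = plt.figure()
    ax = fig.add_axes([0., 0., 1., 1.])
    ax.imshow(data, interpolation="nearest")

    save_time = 0.0
    t_begin = time.perf_counter()
    for _ in range(n_frames):
        # Only the view changes between frames, as in the question.
        lo, hi = np.sort(rng.random(2) * 24)
        ax.set_xlim(lo, hi)
        ax.set_ylim(lo, hi)
        t0 = time.perf_counter()
        fig.savefig(io.BytesIO(), format="png")  # render and encode, no disk write
        save_time += time.perf_counter() - t0
    total = time.perf_counter() - t_begin
    plt.close(fig)
    return save_time, total


if __name__ == "__main__":
    save_time, total = time_loop()
    print("savefig: {:.2f} s of {:.2f} s total".format(save_time, total))
```

If the first number is close to the second, the loop is dominated by rendering, and changes to the limit-setting code will not help much.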

The time to run your code on my machine (Win 8, i5 with 4 cores 3.5GHz) is 2.5 seconds. This seems not too bad. One can get a little improvement by using Multiprocessing.

A note about Multiprocessing: It may seem surprising that using the state machine of pyplot inside multiprocessing should work at all. But it does. And in this case here, since every image is based on the same figure and axes object, one does not even have to create new figures and axes.

I modified an answer I gave here a while ago for your case and the total time is roughly halved using multiprocessing and 5 processes on 4 cores. I appended a barplot which shows the effect of multiprocessing.

import numpy as np
#import matplotlib as mpl
#mpl.use('agg') # use of agg seems to slow things down a bit
import matplotlib.pyplot as plt
import multiprocessing
import time, os

def make_plot(d):
    # time.time() is wall-clock time, so values from different worker
    # processes can be compared (time.clock() was removed in Python 3.8)
    start = time.time()
    x,y=d
    #using aspect in this way causes a warning for me
    #aspect = np.random.random(1)+y/2.0-x 
    xrand = np.random.random(2)*x
    xlim = [min(xrand), max(xrand)]
    yrand = np.random.random(2)*y
    ylim = [min(yrand), max(yrand)]
    filename = '{:d}_{:d}.png'.format(x,y)
    # every worker reuses the figure and axes created at module level
    ax = plt.gca()
    #ax.set_aspect(abs(aspect[0]))
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)
    plt.savefig('figs/'+filename)
    stop = time.time()
    return np.array([x,y, start, stop])

if not os.path.isdir('figs'):
    os.makedirs('figs')
data = np.random.rand(25, 25)

fig = plt.figure()
ax = fig.add_axes([0., 0., 1., 1.])
ax.imshow(data, interpolation='nearest')


# one (x, y) tuple per image to be saved
some_list = []
for i in range(1, 8):
    for j in range(3, 13):
        some_list.append((i,j))


if __name__ == "__main__":
    multiprocessing.freeze_support()
    tstart = time.time()
    num_proc = 5
    p = multiprocessing.Pool(num_proc)

    nu = p.map(make_plot, some_list)

    tooktime = 'Plotting of {} frames took {:.2f} seconds'
    tooktime = tooktime.format(len(some_list), time.time()-tstart)
    print(tooktime)
    nu = np.array(nu)

    # bar plot: one bar per image, from its start to its stop time,
    # measured relative to the start of the run
    plt.close("all")
    fig, ax = plt.subplots(figsize=(8,5))
    plt.suptitle(tooktime)
    ax.barh(np.arange(len(some_list)), nu[:,3]-nu[:,2], 
            height=np.ones(len(some_list)), left=nu[:,2]-tstart,  align="center")
    ax.set_xlabel("time [s]")
    ax.set_ylabel("image number")
    ax.set_ylim([-1,70])
    plt.tight_layout()
    plt.savefig(__file__+".png")
    plt.show()
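
The code above relies on every worker process ending up with its own copy of the module-level figure. A variant that makes this explicit is to build the figure once per worker in a Pool initializer, so the expensive imshow call runs num_proc times rather than once per image, and nothing has to be pickled except the job tuples. This is only a sketch of that idea, not the method used above: the names init_worker, save_tile, run and _state are made up for illustration, the random data stands in for the real image, and the Agg backend is an assumption so that it runs without a display.

```python
import multiprocessing
import os

import matplotlib
matplotlib.use("Agg")  # assumption: no GUI window is needed in the workers
import matplotlib.pyplot as plt
import numpy as np

_state = {}  # per-process storage for the figure and axes (hypothetical helper)


def init_worker(seed=0):
    """Runs once in each worker: build the expensive figure a single time."""
    data = np.random.default_rng(seed).random((25, 25))  # stand-in data
    fig = plt.figure()
    ax = fig.add_axes([0., 0., 1., 1.])
    ax.imshow(data, interpolation="nearest")
    _state["fig"], _state["ax"] = fig, ax


def save_tile(job):
    """Change only the axis limits of this worker's figure and save one png."""
    out_dir, x, y = job
    rng = np.random.default_rng(1000 * x + y)  # separate random stream per job
    ax = _state["ax"]
    ax.set_xlim(np.sort(rng.random(2) * x))
    ax.set_ylim(np.sort(rng.random(2) * y))
    path = os.path.join(out_dir, "{:d}_{:d}.png".format(x, y))
    _state["fig"].savefig(path)
    return path


def run(out_dir="figs", num_proc=5):
    """Save the same 70 images as above, spread over num_proc workers."""
    os.makedirs(out_dir, exist_ok=True)
    jobs = [(out_dir, i, j) for i in range(1, 8) for j in range(3, 13)]
    with multiprocessing.Pool(num_proc, initializer=init_worker) as pool:
        return pool.map(save_tile, jobs)


if __name__ == "__main__":
    print("saved {} images".format(len(run())))
```

Whether this is faster than the version above depends on how expensive the real imshow call is compared with a single savefig, so it is worth timing both on the real data.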
