Fastest 2D convolution or image filter in Python


Problem Description


Several users have asked about the speed or memory consumption of image convolutions in numpy or scipy [1, 2, 3, 4]. From the responses and my experience using Numpy, I believe this may be a major shortcoming of numpy compared to Matlab or IDL.

None of the answers so far have addressed the overall question, so here it is: "What is the fastest method for computing a 2D convolution in Python?" Common python modules are fair game: numpy, scipy, and PIL (others?). For the sake of a challenging comparison, I'd like to propose the following rules:

  1. Input matrices are 2048x2048 and 32x32, respectively.
  2. Single or double precision floating point are both acceptable.
  3. Time spent converting your input matrix to the appropriate format doesn't count -- just the convolution step.
  4. Replacing the input matrix with your output is acceptable (does any python library support that?)
  5. Direct DLL calls to common C libraries are alright -- lapack or scalapack
  6. PyCUDA is right out. It's not fair to use your custom GPU hardware.

Solution

It really depends on what you want to do... A lot of the time, you don't need a fully generic (read: slower) 2D convolution... (i.e. if the filter is separable, you use two 1D convolutions instead... This is why the various scipy.ndimage.gaussian_* and scipy.ndimage.uniform_* filters are much faster than the same thing implemented as a generic n-D convolution.)
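For instance, here's a minimal sketch of the separable case, assuming a rank-1 kernel built as the outer product of a single 32-tap 1D kernel (the kernel and sizes are just illustrative; exact timings vary by machine):

import numpy as np
from scipy import ndimage

x = np.random.random((2048, 2048)).astype(np.float32)
k = np.random.random(32).astype(np.float32)  # 1D kernel; the 2D filter is np.outer(k, k)

# Generic 2D convolution with the full 32x32 kernel: ~32*32 multiplies per output pixel
full = ndimage.convolve(x, np.outer(k, k))

# Separable version: two 1D passes, ~32+32 multiplies per output pixel
sep = ndimage.convolve1d(ndimage.convolve1d(x, k, axis=0), k, axis=1)

print(np.allclose(full, sep, rtol=1e-4))  # same result, far less work

On sizes like these, the two 1D passes should be roughly an order of magnitude cheaper than the single generic 2D pass.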

At any rate, as a point of comparison:

import timeit

t = timeit.timeit(stmt='ndimage.convolve(x, y, output=x)', number=1,
                  setup="""
import numpy as np
from scipy import ndimage
x = np.random.random((2048, 2048)).astype(np.float32)
y = np.random.random((32, 32)).astype(np.float32)
""")
print(t)

This takes 6.9 sec on my machine...

Compare this with fftconvolve

t = timeit.timeit(stmt="signal.fftconvolve(x, y, mode='same')", number=1,
setup="""
import numpy as np
from scipy import signal
x = np.random.random((2048, 2048)).astype(np.float32)
y = np.random.random((32, 32)).astype(np.float32)
""")
print t

This takes about 10.8 secs. However, with different input sizes, using FFTs to do a convolution can be considerably faster (though I can't seem to come up with a good example at the moment...).
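If your SciPy is new enough, you can also let it make the direct-vs-FFT choice for you: scipy.signal.convolve accepts method='auto', which uses scipy.signal.choose_conv_method to estimate whether direct or FFT-based convolution will be faster for the given shapes. A minimal sketch (the kernel sizes below are arbitrary; the crossover point depends on your machine and SciPy build):

import numpy as np
from scipy import signal

x = np.random.random((2048, 2048)).astype(np.float32)

for ksize in (8, 32, 128):
    y = np.random.random((ksize, ksize)).astype(np.float32)
    # Report which method SciPy estimates is faster for these shapes ('direct' or 'fft')
    print(ksize, signal.choose_conv_method(x, y, mode='same'))
    # Or simply let convolve pick the faster method itself
    out = signal.convolve(x, y, mode='same', method='auto')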

