Using strides for an efficient moving average filter


Problem Description

I recently learned about strides in the answer to this post, and was wondering how I could use them to compute a moving average filter more efficiently than what I proposed in this post (using convolution filters).

This is what I have so far. It takes a view of the original array then rolls it by the necessary amount and sums the kernel values to compute the average. I am aware that the edges are not handled correctly, but I can take care of that afterward... Is there a better and faster way? The objective is to filter large floating point arrays up to 5000x5000 x 16 layers in size, a task that scipy.ndimage.filters.convolve is fairly slow at.

Note that I am looking for 8-neighbour connectivity, that is a 3x3 filter takes the average of 9 pixels (8 around the focal pixel) and assigns that value to the pixel in the new image.

import numpy, scipy.misc

filtsize = 3
a = numpy.arange(100).reshape((10,10))
b = numpy.lib.stride_tricks.as_strided(a, shape=(a.size,filtsize), strides=(a.itemsize, a.itemsize))
for i in range(0, filtsize-1):
    if i > 0:
        b += numpy.roll(b, -(pow(filtsize,2)+1)*i, 0)
filtered = (numpy.sum(b, 1) / pow(filtsize,2)).reshape((a.shape[0],a.shape[1]))
scipy.misc.imsave("average.jpg", filtered)

EDIT Clarification on how I see this working:

Current code:

  1. use stride_tricks to generate an array like [[0,1,2],[1,2,3],[2,3,4]...] which corresponds to the top row of the filter kernel.
  2. Roll along the vertical axis to get the middle row of the kernel [[10,11,12],[11,12,13],[13,14,15]...] and add it to the array I got in 1)
  3. Repeat to get the bottom row of the kernel [[20,21,22],[21,22,23],[22,23,24]...]. At this point, I take the sum of each row and divide it by the number of elements in the filter, giving me the average for each pixel, (shifted by 1 row and 1 col, and with some oddities around edges, but I can take care of that later).

What I was hoping for is a better use of stride_tricks to get the 9 values or the sum of the kernel elements directly, for the entire array, or that someone can convince me of another more efficient method...

Solution

For what it's worth, here's how you'd do it using "fancy" striding tricks. I was going to post this yesterday, but got distracted by actual work! :)

@Paul & @eat both have nice implementations using various other ways of doing this. Just to continue things from the earlier question, I figured I'd post the N-dimensional equivalent.

You're not going to be able to significantly beat scipy.ndimage functions for >1D arrays, however. (scipy.ndimage.uniform_filter should beat scipy.ndimage.convolve, though)

Moreover, if you're trying to get a multidimensional moving window, you risk having memory usage blow up whenever you inadvertently make a copy of your array. While the initial "rolling" array is just a view into the memory of your original array, any intermediate steps that copy the array will make a copy that is orders of magnitude larger than your original array (i.e. Let's say that you're working with a 100x100 original array... The view into it (for a filter size of (3,3)) will be 98x98x3x3 but use the same memory as the original. However, any copies will use the amount of memory that a full 98x98x3x3 array would!!)
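To make that size arithmetic concrete, here's a small sketch (numbers assume 8-byte floats); note that `.nbytes` on the view reports the *logical* size, even though no new memory is allocated until something copies it:

```python
import numpy as np

a = np.zeros((100, 100))

# A (3, 3) rolling view reuses the original buffer: no new memory.
view = np.lib.stride_tricks.as_strided(
    a, shape=(98, 98, 3, 3), strides=a.strides + a.strides)

print(a.nbytes)     # 80000 bytes for the 100x100 array
print(view.nbytes)  # 691488 -- the logical size of the 98x98x3x3 view
copy = view.copy()  # ...but a copy really allocates all of it
print(copy.nbytes)  # 691488 bytes, roughly 9x the original
```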

Basically, using crazy striding tricks is great for when you want to vectorize moving window operations on a single axis of an ndarray. It makes it really easy to calculate things like a moving standard deviation, etc with very little overhead. When you want to start doing this along multiple axes, it's possible, but you're usually better off with more specialized functions. (Such as scipy.ndimage, etc)
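As an aside (this postdates the original answer): modern NumPy ships this single-axis pattern as `np.lib.stride_tricks.sliding_window_view` (NumPy >= 1.20), so a moving standard deviation is a one-liner on a zero-copy view:

```python
import numpy as np

x = np.arange(10, dtype=float)

# sliding_window_view builds the same zero-copy trailing-window view.
windows = np.lib.stride_tricks.sliding_window_view(x, 3)  # shape (8, 3)

moving_std = windows.std(axis=-1)    # moving standard deviation, window 3
moving_mean = windows.mean(axis=-1)  # [1., 2., ..., 8.]
```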

At any rate, here's how you do it:

import numpy as np

def rolling_window_lastaxis(a, window):
    """Directly taken from Erik Rigtorp's post to numpy-discussion.
    <http://www.mail-archive.com/numpy-discussion@scipy.org/msg29450.html>"""
    if window < 1:
        raise ValueError("`window` must be at least 1.")
    if window > a.shape[-1]:
        raise ValueError("`window` is too long.")
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

def rolling_window(a, window):
    if not hasattr(window, '__iter__'):
        return rolling_window_lastaxis(a, window)
    for i, win in enumerate(window):
        if win > 1:
            a = a.swapaxes(i, -1)
            a = rolling_window_lastaxis(a, win)
            a = a.swapaxes(-2, i)
    return a

filtsize = (3, 3)
a = np.zeros((10,10), dtype=np.float)
a[5:7,5] = 1

b = rolling_window(a, filtsize)
blurred = b.mean(axis=-1).mean(axis=-1)

So what we get when we do b = rolling_window(a, filtsize) is an 8x8x3x3 array that's actually a view into the same memory as the original 10x10 array. We could just as easily have used different filter sizes along different axes, or operated only along selected axes of an N-dimensional array (i.e. filtsize = (0,3,0,3) on a 4-dimensional array would give us a 6-dimensional view).
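The shape and shared-memory claims are easy to verify directly. A quick check (using NumPy's newer built-in `sliding_window_view`, which builds the same kind of view as the hand-rolled function above):

```python
import numpy as np

a = np.zeros((10, 10))

# A (3, 3) window over a 10x10 array: an 8x8x3x3 zero-copy view.
b = np.lib.stride_tricks.sliding_window_view(a, (3, 3))
print(b.shape)                 # (8, 8, 3, 3)
print(np.shares_memory(a, b))  # True -- still the original 10x10 buffer

# Windowing only selected axes of a 4-D array yields a 6-D view.
a4 = np.zeros((4, 10, 4, 10))
b4 = np.lib.stride_tricks.sliding_window_view(a4, (3, 3), axis=(1, 3))
print(b4.shape)                # (4, 8, 4, 8, 3, 3)
```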

We can then apply an arbitrary function to the last axis repeatedly to effectively calculate things in a moving window.

However, because we're storing temporary arrays that are much bigger than our original array on each step of mean (or std or whatever), this is not at all memory efficient! It's also not going to be terribly fast, either.

The equivalent for ndimage is just:

blurred = scipy.ndimage.uniform_filter(a, filtsize, output=a)

This will handle a variety of boundary conditions, do the "blurring" in-place without requiring a temporary copy of the array, and be very fast. Striding tricks are a good way to apply a function to a moving window along one axis, but they're not a good way to do it along multiple axes, usually....
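Away from the edges the two approaches agree exactly, which makes for a quick sanity check (a sketch: boundary pixels differ because uniform_filter pads the input, while the strided view simply shrinks to fully-valid windows):

```python
import numpy as np
import scipy.ndimage

a = np.zeros((10, 10))
a[5:7, 5] = 1

# Strided 3x3 mean: an 8x8 result covering only fully-valid windows.
view = np.lib.stride_tricks.sliding_window_view(a, (3, 3))
blurred_view = view.mean(axis=(-2, -1))

# ndimage equivalent: its interior matches the strided result exactly.
blurred_full = scipy.ndimage.uniform_filter(a, 3)
assert np.allclose(blurred_view, blurred_full[1:-1, 1:-1])
```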

Just my $0.02, at any rate...
