如何提高sobel边缘检测器的效率 [英] How to improve the efficiency of a sobel edge detector

查看:101
本文介绍了如何提高sobel边缘检测器的效率的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在用Python从头开始编写计算机视觉库,以与rpi相机一起使用.目前,我已经实现了向greyscale的转换以及一些其他基本的img操作,它们都在我的model B rpi3上相对较快地运行.

I am writing a computer vision library from scratch in Python to work with a rpi camera. At the moment, I have implemented conversion to greyscale and some other basic img operations which both run relatively fast on my model B rpi3.

但是,我使用sobel运算符(维基百科描述)进行边缘检测的功能是尽管它确实起作用,但比其他功能慢得多.在这里:

However, my edge detection function using the sobel operator (wikipedia description) is much slower than the other functions although it does work. Here it is:

def sobel(img):
    xKernel = np.array([[-1,0,1],[-2,0,2],[-1,0,1]])
    yKernel = np.array([[-1,-2,-1],[0,0,0],[1,2,1]])
    sobelled = np.zeros((img.shape[0]-2, img.shape[1]-2, 3), dtype="uint8")
    for y in range(1, img.shape[0]-1):
        for x in range(1, img.shape[1]-1):
            gx = np.sum(np.multiply(img[y-1:y+2, x-1:x+2], xKernel))
            gy = np.sum(np.multiply(img[y-1:y+2, x-1:x+2], yKernel))
            g = abs(gx) + abs(gy) #math.sqrt(gx ** 2 + gy ** 2) (Slower)
            g = g if g > 0 and g < 255 else (0 if g < 0 else 255)
            sobelled[y-1][x-2] = g
    return sobelled

并使用这张greyscale猫的图片运行它:

and running it with this greyscale image of a cat:

我得到这个答复,这似乎是正确的:

I get this response, which seems correct:

该库的应用程序,尤其是此功能,是在下棋机器人上进行的,其中边缘检测将有助于识别棋子的位置.问题是运行需要>15秒,这是一个严重的问题,因为它会增加机器人大量移动的时间.

The application of the library, and this function in particular, is on a chess playing robot in which the edge detection will help to recognise the location of the pieces. The problem is that it takes >15 seconds to run which is a significant problem as it will add to the time the robot takes to make its move by a lot.

我的问题是:我如何加快速度?

到目前为止,我已经尝试了几件事:

So far, I have tried a couple of things:

  1. 而不是squaring,然后是adding,然后是square rooting,然后是gxgy值,以获得总梯度,我只是sum absolute值.这样可以将速度提高很多.

  1. Instead of squaring then adding, then square rooting the gx and gy values to get the total gradient, I just sum the absolute values. This improved the speed a decent amount.

使用来自rpi相机的较低的resolution图像.显然,这是使这些操作更快运行的简单方法,但是由于在480x360的最小可用分辨率下仍非常慢,因此远不是那么可行,这已经大大低于相机的3280x2464最大值.

Using a lower resolution image from the rpi camera. This obviously is a simple way to make these operations run faster, however its not really that viable as its still pretty slow at the minimum usable resolution of 480x360 which is massively reduced from the camera's max of 3280x2464.

编写嵌套的for循环以代替np.sum(np.multiply(...))来执行matrix convolutions.由于np.multiply返回一个新数组,这最终使 slower 稍微让我感到惊讶,因为我认为使用loops这样做应该会更快.我认为尽管这可能是由于numpy主要是用C编写的事实,或者实际上并未存储新数组,所以并不需要很长时间,但是我不太确定.

Writing nested for-loops to do the matrix convolutions in place of the np.sum(np.multiply(...)). This ended up being slightly slower which I was surprised by as since np.multiply returns a new array, I thought that it should have been faster to do it with loops. I think though that this may be due to the fact that numpy is mostly written in C or that the new array isn't actually stored so doesn't take a long time but I'm not too sure.

任何帮助将不胜感激-我认为最主要的改进是要点3,即matrix乘法和求和.

Any help would be much appreciated - I think the main thing for improvement is point 3, i.e the matrix multiplication and summing.

推荐答案

即使您正在构建自己的库,您也绝对应该使用库进行卷积,它们将在后端使用C或Fortran进行结果运算将会快很多.

Even though you're building out your own library, you really should absolutely use libraries for convolution, they will do the resulting operations in C or Fortran in the backend which will be much, much faster.

但是,如果需要,可以自己做,请使用线性可分离滤波器.这是主意:

But to do it yourself if you'd like, use linear separable filters. Here's the idea:

图片:

1 2 3 4 5
2 3 4 5 1
3 4 5 1 2

Sobel x内核:

-1 0 1
-2 0 2
-1 0 1

结果:

8, 3, -7

在卷积的第一个位置,您将计算9个值.首先,为什么?您永远不会添加中间列,不必费心将其相乘.但这不是线性可分离滤波器的重点.这个想法很简单.将内核放在第一个位置时,将第三列乘以[1, 2, 1].但是接下来的两步,您将第三列乘以[-1, -2, -1].真是浪费!您已经计算过了,现在只需要取反即可.这就是线性可分离滤波器的想法.请注意,您可以将过滤器分解为两个向量的矩阵外积:

At the first position of the convolution, you'll be computing 9 values. First off, why? You're never going to add the middle column, don't bother multiplying it. But that's not the point of linear separable filters. The idea is simple. When you place the kernel at the first position, you'll multiply the third column by [1, 2, 1]. But then two steps later, you'll multiply the third column by [-1, -2, -1]. What a waste! You've already calculated that, you just need to negate it now. And that's the idea with a linear separable filter. Notice that you can break up the filter into a matrix outer product of two vectors:

[1]
[2]  *  [-1, 0, 1]
[1]

在这里取外部乘积会产生相同的矩阵.因此,这里的想法是将操作分为两部分.首先将整个图像乘以行向量,然后乘以列向量.取行向量

Taking the outer product here yields the same matrix. So the idea here is to split the operation into two pieces. First multiply the whole image with the row vector, then the column vector. Taking the row vector

-1 0 1

在整个图像上,我们最终得到

across the image, we end up with

2  2  2
2  2 -3
2 -3 -3

然后将列向量传递给乘法和求和,我们再次得到

And then passing the column vector through to be multiplied and summed, we again get

8, 3, -7

另一个可能有用或可能没有帮助的漂亮技巧(取决于您在内存和效率之间的权衡):

One other nifty trick that may or may not be helpful (depends on your tradeoffs between memory and efficiency):

请注意,在单行乘法中,您忽略中间值,而只从左边的值中减去右边的值.这意味着您正在有效地执行以下操作减去这两个图像:

Note that in the single row multiplication, you ignore the middle value, and just subtract the right from the left values. This means that effectively what you are doing is subtracting these two images:

3 4 5     1 2 3
4 5 1  -  2 3 4
5 1 2     3 4 5

如果将图像的前两列剪掉,则会得到左矩阵,如果将后两列剪掉,则会得到右矩阵.因此,您只需简单地计算卷积的第一部分即可

If you cut the first two columns off of your image, you get the left matrix, and if you cut the last two columns off, you get the right matrix. So you can just compute this first part of the convolution simply as

result_h = img[:,2:] - img[:,:-2]

然后您可以遍历sobel运算符的其余列.或者,您甚至可以继续进行,然后做我们刚才做的相同的事情.这次是垂直情况,您只需要添加第一行和第三行,第二行加倍即可;或者,使用numpy加法:

And then you can loop through for the remaining column of the sobel operator. Or, you can even proceed further and do the same thing we just did. This time for the vertical case, you simply need to add the first and third row and twice the second row; or, using numpy addition:

result_v = result_h[:-2] + result_h[2:] + 2*result_h[1:-1]

您完成了!我可能会在不久的将来在此处添加一些时间.对于信封的某些计算(例如,在1000x1000图像上草草的Jupyter笔记本计时):

And you're done! I may add some timings here in the near-future. For some back of the envelope calculations (i.e. hasty Jupyter notebook timings on a 1000x1000 image):

新方法(图像的总和):每个循环8.18 ms±399 µs(平均值±标准偏差,共运行7次,每个循环100次)

new method (sums of the images): 8.18 ms ± 399 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

旧方法(双循环):每个循环7.32 s±207毫秒(平均±标准偏差,共运行7次,每个循环1次)

old method (double for-loop):7.32 s ± 207 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

是的,您没看错:1000倍加速.

Yes, you read that right: 1000x speedup.

下面是一些比较两者的代码:

Here's some code comparing the two:

import numpy as np

def sobel_x_orig(img):
    xKernel = np.array([[-1,0,1],[-2,0,2],[-1,0,1]])
    sobelled = np.zeros((img.shape[0]-2, img.shape[1]-2))
    for y in range(1, img.shape[0]-1):
        for x in range(1, img.shape[1]-1):
            sobelled[y-1, x-1] = np.sum(np.multiply(img[y-1:y+2, x-1:x+2], xKernel))
    return sobelled

def sobel_x_new(img):
    result_h = img[:,2:] - img[:,:-2]
    result_v = result_h[:-2] + result_h[2:] + 2*result_h[1:-1]
    return result_v

img = np.random.rand(1000, 1000)
sobel_new = sobel_x_new(img)
sobel_orig = sobel_x_orig(img)

assert (np.abs(sobel_new-sobel_orig) < 1e-12).all()

当然,1e-12的容忍度很严格,但这是针对每个元素的,因此应该可以.但是我还有一个float图像,对于uint8图像,您当然会有更大的差异.

Of course, 1e-12 is some serious tolerance, but this is per element so it should be OK. But also I have a float image, you'll of course have larger differences for uint8 images.

请注意,您可以对任何线性可分离滤波器执行此操作!包括高斯滤波器.还要注意,一般来说,这需要很多操作.在C或Fortran或任何其他语言中,通常仅将其实现为单行/列向量的两个卷积,因为最后,无论如何,它实际上需要遍历每个矩阵的每个元素.不管您是将它们相加还是相乘,因此在C中用这种方法添加图像值并不比卷积要快.但是遍历numpy数组非常慢,因此在Python中这种方法要快得多.

Note that you can do this for any linear separable filter! That includes Gaussian filters. Note also that in general, this requires a lot of operations. In C or Fortran or whatever, it's usually just implemented as two convolutions of the single row/column vectors because in the end, it needs to actually loop through every element of each matrix anyways; whether you're just adding them or multiplying them, so it's no faster in C to do it this way where you add the image values than if you just do the convolutions. But looping through numpy arrays is super slow, so this method is way faster in Python.

这篇关于如何提高sobel边缘检测器的效率的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆