在python中快速找到峰值并进行质心控制 [英] Fast peak-finding and centroiding in python

查看:1863
本文介绍了在python中快速找到峰值并进行质心控制的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试在python中开发一个快速算法,用于在图像中找到峰值,然后找到这些峰值的质心。我使用scipy.ndimage.label和ndimage.find_objects编写了以下代码来定位对象。这似乎是代码中的瓶颈,在500x500图像中定位20个对象大约需要7毫秒。我想将其扩展到更大的(2000x2000)图像,但随后时间增加到几乎100毫秒。所以,我想知道是否有更快的选择。

I am trying to develop a fast algorithm in python for finding peaks in an image and then finding the centroid of those peaks. I have written the following code using the scipy.ndimage.label and ndimage.find_objects for locating the objects. This seems to be the bottleneck in the code, and it takes about 7 ms to locate 20 objects in a 500x500 image. I would like to scale this up to larger (2000x2000) image, but then the time increases to almost 100 ms. So, I'm wondering if there is a faster option.

这是我到目前为止的代码,它有效,但速度很慢。首先,我使用一些高斯峰来模拟我的数据。这部分很慢,但实际上我会使用真实数据,所以我不太关心加速那部分。我希望能够很快找到峰值。

Here is the code that I have so far, which works, but is slow. First I simulate my data using some gaussian peaks. This part is slow, but in practice I will be using real data, so I don't care too much about speeding that part up. I would like to be able to find the peaks very quickly.

import time
import numpy as np
import matplotlib.pyplot as plt
import scipy.ndimage
import matplotlib.patches 

plt.figure(figsize=(10,10))
ax1 = plt.subplot(221)
ax2 = plt.subplot(222)
ax3 = plt.subplot(223)
ax4 = plt.subplot(224)

size        = 500 #width and height of image in pixels
peak_height = 100 # define the height of the peaks
num_peaks   = 20
noise_level = 50
threshold   = 60

np.random.seed(3)

#set up a simple, blank image (Z)
x = np.linspace(0,size,size)
y = np.linspace(0,size,size)

X,Y = np.meshgrid(x,y)
Z = X*0

#now add some peaks
def gaussian(X,Y,xo,yo,amp=100,sigmax=4,sigmay=4):
    return amp*np.exp(-(X-xo)**2/(2*sigmax**2) - (Y-yo)**2/(2*sigmay**2))

for xo,yo in size*np.random.rand(num_peaks,2):
    widthx = 5 + np.random.randn(1)
    widthy = 5 + np.random.randn(1)
    Z += gaussian(X,Y,xo,yo,amp=peak_height,sigmax=widthx,sigmay=widthy)

#of course, add some noise:
Z = Z + scipy.ndimage.gaussian_filter(0.5*noise_level*np.random.rand(size,size),sigma=5)    
Z = Z + scipy.ndimage.gaussian_filter(0.5*noise_level*np.random.rand(size,size),sigma=1)    

t = time.time() #Start timing the peak-finding algorithm

#Set everything below the threshold to zero:
Z_thresh = np.copy(Z)
Z_thresh[Z_thresh<threshold] = 0
print 'Time after thresholding: %.5f seconds'%(time.time()-t)

#now find the objects
labeled_image, number_of_objects = scipy.ndimage.label(Z_thresh)
print 'Time after labeling: %.5f seconds'%(time.time()-t)

peak_slices = scipy.ndimage.find_objects(labeled_image)
print 'Time after finding objects: %.5f seconds'%(time.time()-t)

def centroid(data):
    h,w = np.shape(data)   
    x = np.arange(0,w)
    y = np.arange(0,h)

    X,Y = np.meshgrid(x,y)

    cx = np.sum(X*data)/np.sum(data)
    cy = np.sum(Y*data)/np.sum(data)

    return cx,cy

centroids = []

for peak_slice in peak_slices:
    dy,dx  = peak_slice
    x,y = dx.start, dy.start
    cx,cy = centroid(Z_thresh[peak_slice])
    centroids.append((x+cx,y+cy))

print 'Total time: %.5f seconds\n'%(time.time()-t)

###########################################
#Now make the plots:
for ax in (ax1,ax2,ax3,ax4): ax.clear()
ax1.set_title('Original image')
ax1.imshow(Z,origin='lower')

ax2.set_title('Thresholded image')
ax2.imshow(Z_thresh,origin='lower')

ax3.set_title('Labeled image')
ax3.imshow(labeled_image,origin='lower') #display the color-coded regions

for peak_slice in peak_slices:  #Draw some rectangles around the objects
    dy,dx  = peak_slice
    xy     = (dx.start, dy.start)
    width  = (dx.stop - dx.start + 1)
    height = (dy.stop - dy.start + 1)
    rect = matplotlib.patches.Rectangle(xy,width,height,fc='none',ec='red')
    ax3.add_patch(rect,)

ax4.set_title('Centroids on original image')
ax4.imshow(Z,origin='lower')

for x,y in centroids:
    ax4.plot(x,y,'kx',ms=10)

ax4.set_xlim(0,size)
ax4.set_ylim(0,size)

plt.tight_layout
plt.show()

结果尺寸= 500:

The results for size=500:

编辑:如果峰的数量很大(~100)并且图像的大小很小,则瓶颈实际上是质心部分。所以,也许这部分的速度也需要优化。

If the number of peaks is large (~100) and the size of the image is small, then the bottleneck is actually the centroiding part. So, perhaps the speed of this part also needs to be optimized.

推荐答案

你找到峰值的方法(简单的阈值处理)是当然对阈值的选择非常敏感:设置得太低,你会检测不是峰值的东西;设置得太高,你会错过有效的峰值。

Your method for finding the peaks (simple thresholding) is of course very sensitive to the choice of threshold: set it too low and you'll "detect" things that are not peaks; set it too high and you'll miss valid peaks.

还有更强大的替代方案,可以检测图像强度中的所有局部最大值,无论其强度值如何。我首选的是使用小(5x5或7x7)结构元素进行扩张,然后找到原始图像及其扩张版本具有相同值的像素。这是有效的,因为根据定义,膨胀(x,y,E,img)= {以像素(x,y)为中心的E中的img的最大值},因此膨胀(x,y,E,img)= img(x ,y)每当(x,y)是E标度处的局部最大值的位置时。

There are more robust alternatives, that will detect all the local maxima in the image intensity regardless of their intensity value. My preferred one is applying a dilation with a small (5x5 or 7x7) structuring element, then find the pixels where the original image and its dilated version have the same value. This works because, by definition, dilation(x, y, E, img) = { max of img within E centered at pixel (x,y) }, and therefore dilation(x, y, E, img) = img(x, y) whenever (x,y) is the location of a local maximum at the scale of E.

快速实现形态运算符(例如, OpenCV)这个算法在空间和时间上都是线性的图像大小(一个额外的图像大小的缓冲区用于扩张图像,一个传递两者)。在一个紧要关头,它也可以在线实现,没有额外的缓冲和一点额外的复杂性,它仍然是线性时间。

With a fast implementation of the morphological operators (e.g. the one in OpenCV) this algorithm is linear in the size of the image in both space and time (one extra image-sized buffer for the dilated image, and one pass on both). In a pinch, it can also be implemented on-line without the extra buffer and a little extra complexity, and it's still linear time.

进一步证明它存在盐和胡椒或类似的噪音,可能会引入许多假最大值,您可以应用该方法两次,使用不同大小的结构元素(例如,5x5和7x7),然后只保留稳定的最大值,其中可以定义稳定性通过不改变最大值的位置,或通过不改变多于一个像素的位置等。此外,当您有理由相信它们是由噪声引起时,您可能希望抑制较低的附近最大值。一种有效的方法是首先检测上面的所有局部最大值,按高度降序排序,然后沿着排序列表向下,如果图像中的值没有改变则保留它们,如果它们被保留,则设置为将它们中的(2d + 1)x(2d + 1)邻域中的所有像素归零,其中d是您愿意容忍的附近最大值之间的最小距离。

To further robustify it in the presence of salt-and-pepper or similar noise, which may introduce many false maxima, you can apply the method twice, with structuring elements of different size (say, 5x5 and 7x7), then retain only the stable maxima, where stability can be defined by unchanging position of the maxima, or by position not changing by more than one pixel, etc. Additionally, you may want to suppress low nearby maxima when you have reason to believe they are due to noise. An efficient way to do this is to first detect all the local maxima as above, sort them descending by height, then go down the sorted list and keep them if their value in the image has not changed and, if they are kept, set to zero all the pixels in a (2d+1) x (2d+1) neighborhood of them, where d is the min distance between nearby maxima that you are willing to tolerate.

这篇关于在python中快速找到峰值并进行质心控制的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆