How to detect a Christmas Tree?


Problem Description



Which image processing techniques could be used to implement an application that detects the Christmas trees displayed in the following images?

I'm searching for solutions that are going to work on all these images. Therefore, approaches that require training haar cascade classifiers or template matching are not very interesting.

I'm looking for something that can be written in any programming language, as long as it uses only Open Source technologies. The solution must be tested with the images that are shared in this question. There are 6 input images and the answer should display the results of processing each of them. Finally, for each output image there must be red lines drawn to surround the detected tree.

How would you go about programmatically detecting the trees in these images?

Solution

I have an approach which I think is interesting and a bit different from the rest. The main difference in my approach, compared to some of the others, is in how the image segmentation step is performed--I used the DBSCAN clustering algorithm from Python's scikit-learn; it's optimized for finding somewhat amorphous shapes that may not necessarily have a single clear centroid.

At the top level, my approach is fairly simple and can be broken down into about 3 steps. First I apply a threshold (or actually, the logical "or" of two separate and distinct thresholds). As with many of the other answers, I assumed that the Christmas tree would be one of the brighter objects in the scene, so the first threshold is just a simple monochrome brightness test; any pixels with values above 220 on a 0-255 scale (where black is 0 and white is 255) are saved to a binary black-and-white image. The second threshold tries to look for red and yellow lights, which are particularly prominent in the trees in the upper left and lower right of the six images, and stand out well against the blue-green background which is prevalent in most of the photos. I convert the rgb image to hsv space, and require that the hue is either less than 0.2 on a 0.0-1.0 scale (corresponding roughly to the border between yellow and green) or greater than 0.95 (corresponding to the border between purple and red) and additionally I require bright, saturated colors: saturation and value must both be above 0.7. The results of the two threshold procedures are logically "or"-ed together, and the resulting matrix of black-and-white binary images is shown below:
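For reference, a minimal standalone sketch of this two-part threshold might look like the following; the constants mirror the values just described, and rgbimg is assumed to be an [M,N,3] uint8 array, as in the full listing further down:

from PIL import Image
import numpy as np
import matplotlib.colors as colors

def threshold_sketch(rgbimg, monothr=220, hueleftthr=0.2, huerightthr=0.95,
                     satthr=0.7, valthr=0.7):
    # Monochrome brightness test on a 0-255 scale
    gry = np.asarray(Image.fromarray(rgbimg).convert('L'))
    bright = gry > monothr
    # HSV test: hue in the red/yellow wrap-around region, and saturated + bright
    hsv = colors.rgb_to_hsv(rgbimg.astype(float)/255)
    lights = ((hsv[:,:,0] < hueleftthr) | (hsv[:,:,0] > huerightthr)) \
             & (hsv[:,:,1] > satthr) & (hsv[:,:,2] > valthr)
    # Logical "or" of the two tests gives the binary black-and-white image
    return np.where(bright | lights, 255, 0).astype(np.uint8)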

You can clearly see that each image has one large cluster of pixels roughly corresponding to the location of each tree, plus a few of the images also have some other small clusters corresponding either to lights in the windows of some of the buildings, or to a background scene on the horizon. The next step is to get the computer to recognize that these are separate clusters, and label each pixel correctly with a cluster membership ID number.

For this task I chose DBSCAN. There is a pretty good visual comparison of how DBSCAN typically behaves, relative to other clustering algorithms, available here. As I said earlier, it does well with amorphous shapes. The output of DBSCAN, with each cluster plotted in a different color, is shown here:
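As a quick, self-contained illustration of that tendency (toy data only, not part of the detector; the eps and min_samples values here are just plausible choices for the synthetic half-moons):

import numpy as np
from sklearn.datasets import make_moons
from sklearn.cluster import DBSCAN, KMeans
from sklearn.metrics import adjusted_rand_score

# Two interleaving half-moons: amorphous, non-convex clusters with no single
# clear centroid each -- the kind of shape DBSCAN handles well
X, y = make_moons(n_samples=500, noise=0.05, random_state=0)

db = DBSCAN(eps=0.2, min_samples=5).fit(X)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Agreement with the true moon membership is typically much higher for DBSCAN
# than for KMeans, which cuts the plane with a straight boundary
print("DBSCAN agreement:", adjusted_rand_score(y, db.labels_))
print("KMeans agreement:", adjusted_rand_score(y, km.labels_))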

There are a few things to be aware of when looking at this result. First is that DBSCAN requires the user to set a "proximity" parameter in order to regulate its behavior, which effectively controls how separated a pair of points must be in order for the algorithm to declare a new separate cluster rather than agglomerating a test point onto an already pre-existing cluster. I set this value to be 0.04 times the size along the diagonal of each image. Since the images vary in size from roughly VGA up to about HD 1080, this type of scale-relative definition is critical.
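Sketched in isolation (assuming pixel_coords is the [P,2] array of thresholded pixel coordinates; the 0.04 fraction and min_samples=10 are the same values used in the full listing below), the conversion to pixel units looks like this:

from math import sqrt
from sklearn.cluster import DBSCAN

def cluster_pixels(pixel_coords, img_shape, proxfrac=0.04):
    # Scale the relative threshold by the image diagonal so the same fraction
    # behaves consistently from roughly VGA-sized up to 1080p-sized inputs
    eps_pixels = proxfrac * sqrt(img_shape[0]**2 + img_shape[1]**2)
    db = DBSCAN(eps=eps_pixels, min_samples=10).fit(pixel_coords)
    return db.labels_.astype(int)   # -1 marks points assigned to no cluster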

Another point worth noting is that the DBSCAN algorithm as it is implemented in scikit-learn has memory limits which are fairly challenging for some of the larger images in this sample. Therefore, for a few of the larger images, I actually had to "decimate" (i.e., retain only every 3rd or 4th pixel and drop the others) each cluster in order to stay within this limit. As a result of this culling process, the remaining individual sparse pixels are difficult to see on some of the larger images. Therefore, for display purposes only, the color-coded pixels in the above images have been effectively "dilated" just slightly so that they stand out better. It's purely a cosmetic operation for the sake of the narrative; although there are comments mentioning this dilation in my code, rest assured that it has nothing to do with any calculations that actually matter.
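The decimation itself is applied to the full set of thresholded pixel coordinates before clustering, and in isolation it amounts to something like this sketch (the cap of 5000 points is the same maxpoints value used in the full listing):

from math import ceil

def decimate_points(X, maxpoints=5000):
    # Keep roughly every k-th thresholded pixel so that no more than about
    # maxpoints coordinates are handed to DBSCAN; surplus points are simply
    # dropped, nothing is averaged or interpolated
    nsample = len(X)
    if nsample <= maxpoints:
        return X
    step = int(ceil(float(nsample)/maxpoints))
    return X[::step]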

Once the clusters are identified and labeled, the third and final step is easy: I simply take the largest cluster in each image (in this case, I chose to measure "size" in terms of the total number of member pixels, although one could have just as easily instead used some type of metric that gauges physical extent) and compute the convex hull for that cluster. The convex hull then becomes the tree border. The six convex hulls computed via this method are shown below in red:
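Restated as a standalone sketch (assuming points is the [Q,2] numpy array handed to DBSCAN and labels is its label array; unlike the full listing below, this version explicitly skips DBSCAN's -1 noise label when looking for the largest cluster):

import numpy as np
from scipy.spatial import ConvexHull

def largest_cluster_hull(points, labels):
    # Pick the cluster with the most member pixels, ignoring noise points
    best_label, best_count = None, 0
    for k in set(labels):
        if k == -1:
            continue
        count = np.sum(labels == k)
        if count > best_count:
            best_label, best_count = k, count
    cluster_pts = points[labels == best_label]
    hull = ConvexHull(cluster_pts)
    # Each simplex is a pair of indices into cluster_pts defining one border segment
    return [[cluster_pts[simplex,0], cluster_pts[simplex,1]]
            for simplex in hull.simplices]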

The source code is written for Python 2.7.6 and it depends on numpy, scipy, matplotlib and scikit-learn. I've divided it into two parts. The first part is responsible for the actual image processing:

from PIL import Image
import numpy as np
import scipy as sp
import scipy.spatial            # ensure sp.spatial.ConvexHull is available
import matplotlib.colors as colors
from sklearn.cluster import DBSCAN
from math import ceil, sqrt

"""
Inputs:

    rgbimg:         [M,N,3] numpy array containing (uint, 0-255) color image

    hueleftthr:     Scalar constant to select maximum allowed hue in the
                    yellow-green region

    huerightthr:    Scalar constant to select minimum allowed hue in the
                    blue-purple region

    satthr:         Scalar constant to select minimum allowed saturation

    valthr:         Scalar constant to select minimum allowed value

    monothr:        Scalar constant to select minimum allowed monochrome
                    brightness

    maxpoints:      Scalar constant maximum number of pixels to forward to
                    the DBSCAN clustering algorithm

    proxthresh:     Proximity threshold to use for DBSCAN, as a fraction of
                    the diagonal size of the image

Outputs:

    borderseg:      [K,2,2] Nested list containing K pairs of x- and y- pixel
                    values for drawing the tree border

    X:              [P,2] List of pixels that passed the threshold step

    labels:         [Q]   List of cluster labels for points in Xslice (see
                    below)

    Xslice:         [Q,2] Reduced list of pixels to be passed to DBSCAN

"""

def findtree(rgbimg, hueleftthr=0.2, huerightthr=0.95, satthr=0.7, 
             valthr=0.7, monothr=220, maxpoints=5000, proxthresh=0.04):

    # Convert rgb image to monochrome for the brightness threshold
    gryimg = np.asarray(Image.fromarray(rgbimg).convert('L'))
    # Convert rgb image (uint, 0-255) to hsv (float, 0.0-1.0)
    hsvimg = colors.rgb_to_hsv(rgbimg.astype(float)/255)

    # Initialize binary thresholded image
    binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
    # Find pixels with hue<0.2 or hue>0.95 (red or yellow) and saturation/value
    # both greater than 0.7 (saturated and bright)--tends to coincide with
    # ornamental lights on trees in some of the images
    boolidx = np.logical_and(
                np.logical_and(
                  np.logical_or((hsvimg[:,:,0] < hueleftthr),
                                (hsvimg[:,:,0] > huerightthr)),
                                (hsvimg[:,:,1] > satthr)),
                                (hsvimg[:,:,2] > valthr))
    # Find pixels that meet hsv criterion
    binimg[np.where(boolidx)] = 255
    # Add pixels that meet grayscale brightness criterion
    binimg[np.where(gryimg > monothr)] = 255

    # Prepare thresholded points for DBSCAN clustering algorithm
    X = np.transpose(np.where(binimg == 255))
    Xslice = X
    nsample = len(Xslice)
    if nsample > maxpoints:
        # Make sure number of points does not exceed DBSCAN maximum capacity
        Xslice = X[range(0,nsample,int(ceil(float(nsample)/maxpoints)))]

    # Translate DBSCAN proximity threshold to units of pixels and run DBSCAN
    pixproxthr = proxthresh * sqrt(binimg.shape[0]**2 + binimg.shape[1]**2)
    db = DBSCAN(eps=pixproxthr, min_samples=10).fit(Xslice)
    labels = db.labels_.astype(int)

    # Find the largest cluster (i.e., with most points) and obtain convex hull   
    unique_labels = set(labels)
    maxclustpt = 0
    for k in unique_labels:
        class_members = [index[0] for index in np.argwhere(labels == k)]
        if len(class_members) > maxclustpt:
            points = Xslice[class_members]
            hull = sp.spatial.ConvexHull(points)
            maxclustpt = len(class_members)
            borderseg = [[points[simplex,0], points[simplex,1]] for simplex
                          in hull.simplices]

    return borderseg, X, labels, Xslice

and the second part is a user-level script which calls the first file and generates all of the plots above:

#!/usr/bin/env python

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from findtree import findtree

# Image files to process
fname = ['nmzwj.png', 'aVZhC.png', '2K9EF.png',
         'YowlH.png', '2y4o5.png', 'FWhSP.png']

# Initialize figures
fgsz = (16,7)        
figthresh = plt.figure(figsize=fgsz, facecolor='w')
figclust  = plt.figure(figsize=fgsz, facecolor='w')
figcltwo  = plt.figure(figsize=fgsz, facecolor='w')
figborder = plt.figure(figsize=fgsz, facecolor='w')
figthresh.canvas.set_window_title('Thresholded HSV and Monochrome Brightness')
figclust.canvas.set_window_title('DBSCAN Clusters (Raw Pixel Output)')
figcltwo.canvas.set_window_title('DBSCAN Clusters (Slightly Dilated for Display)')
figborder.canvas.set_window_title('Trees with Borders')

for ii, name in zip(range(len(fname)), fname):
    # Open the file and convert to rgb image
    rgbimg = np.asarray(Image.open(name))

    # Get the tree borders as well as a bunch of other intermediate values
    # that will be used to illustrate how the algorithm works
    borderseg, X, labels, Xslice = findtree(rgbimg)

    # Display thresholded images
    axthresh = figthresh.add_subplot(2,3,ii+1)
    axthresh.set_xticks([])
    axthresh.set_yticks([])
    binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
    for v, h in X:
        binimg[v,h] = 255
    axthresh.imshow(binimg, interpolation='nearest', cmap='Greys')

    # Display color-coded clusters
    axclust = figclust.add_subplot(2,3,ii+1) # Raw version
    axclust.set_xticks([])
    axclust.set_yticks([])
    axcltwo = figcltwo.add_subplot(2,3,ii+1) # Dilated slightly for display only
    axcltwo.set_xticks([])
    axcltwo.set_yticks([])
    axcltwo.imshow(binimg, interpolation='nearest', cmap='Greys')
    clustimg = np.ones(rgbimg.shape)    
    unique_labels = set(labels)
    # Generate a unique color for each cluster 
    plcol = cm.rainbow_r(np.linspace(0, 1, len(unique_labels)))
    for lbl, pix in zip(labels, Xslice):
        for col, unqlbl in zip(plcol, unique_labels):
            if lbl == unqlbl:
                # Cluster label of -1 indicates no cluster membership;
                # override default color with black
                if lbl == -1:
                    col = [0.0, 0.0, 0.0, 1.0]
                # Raw version
                for ij in range(3):
                    clustimg[pix[0],pix[1],ij] = col[ij]
                # Dilated just for display
                axcltwo.plot(pix[1], pix[0], 'o', markerfacecolor=col, 
                    markersize=1, markeredgecolor=col)
    axclust.imshow(clustimg)
    axcltwo.set_xlim(0, binimg.shape[1]-1)
    axcltwo.set_ylim(binimg.shape[0], -1)

    # Plot original images with red borders around the trees
    axborder = figborder.add_subplot(2,3,ii+1)
    axborder.set_axis_off()
    axborder.imshow(rgbimg, interpolation='nearest')
    for vseg, hseg in borderseg:
        axborder.plot(hseg, vseg, 'r-', lw=3)
    axborder.set_xlim(0, binimg.shape[1]-1)
    axborder.set_ylim(binimg.shape[0], -1)

plt.show()
