Algorithm improvement for Coca-Cola can shape recognition


Problem description

One of the most interesting projects I've worked on in the past couple of years, while I was still a student, was a final project on image processing. The goal was to develop a system able to recognize Coca-Cola cans (note that I'm stressing the word "cans"; you'll see why in a minute). You can see a sample below, with the can recognized in the green rectangle, along with its scale and rotation.

Some constraints on the project:

  • The background could be very noisy.
  • The can could have any scale or rotation, or even orientation (within reasonable limits).
  • The image could have some degree of fuzziness (contours might not be really straight).
  • There could be Coca-Cola bottles in the image, and the algorithm should detect only the can!
  • The brightness of the image could vary a lot (so you can't rely "too much" on color detection).
  • The can could be partly hidden on the sides or in the middle (and possibly partly hidden behind a bottle!).
  • There could be no can at all in the image, in which case you had to find nothing and write a message saying so.

So you could end up with tricky things like this (on which, in this case, my algorithm totally failed):

I did this project a while ago, had a lot of fun doing it, and ended up with a decent implementation. Here are some details about it:

Language: Done in C++ using the OpenCV library.

Pre-processing: By image pre-processing I mean how to transform the image into a rawer form to give to the algorithm. I used two methods, followed by an edge-detection step:

  1. Changing the color domain from RGB to HSV (Hue Saturation Value) and filtering on "red" hue, on saturation above a certain threshold to avoid orange-like colors, and on value above a minimum to avoid dark tones. The end result was a binary black-and-white image, where every white pixel represents a pixel that passed these thresholds. Obviously there is still a lot of clutter in the image, but this reduces the number of dimensions you have to work with.
  2. Noise filtering using a median filter (taking the median pixel value of all neighbors and replacing the pixel with this value) to reduce noise.
  3. Using the Canny edge detection filter to get the contours of all items after the 2 preceding steps.
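The first two steps can be sketched in plain C++ without OpenCV, just to make the per-pixel logic explicit. This is a minimal sketch, not the author's code: the names `rgbToHsv`, `isRed`, `redMask` and `median3x3` are hypothetical, and the thresholds (hue within 15 degrees of red, saturation at least 0.5, value at least 0.3) are assumed purely for illustration. With OpenCV, the same steps would normally go through `cv::cvtColor`, `cv::inRange`, `cv::medianBlur` and `cv::Canny`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Hsv { double h, s, v; };  // h in degrees [0, 360), s and v in [0, 1]

// Convert one 8-bit RGB pixel to HSV.
Hsv rgbToHsv(std::uint8_t r8, std::uint8_t g8, std::uint8_t b8) {
    const double r = r8 / 255.0, g = g8 / 255.0, b = b8 / 255.0;
    const double maxc = std::max({r, g, b});
    const double minc = std::min({r, g, b});
    const double delta = maxc - minc;
    double h = 0.0;
    if (delta > 0.0) {
        if (maxc == r)      h = 60.0 * std::fmod((g - b) / delta, 6.0);
        else if (maxc == g) h = 60.0 * ((b - r) / delta + 2.0);
        else                h = 60.0 * ((r - g) / delta + 4.0);
        if (h < 0.0) h += 360.0;
    }
    const double s = (maxc > 0.0) ? delta / maxc : 0.0;
    return {h, s, maxc};
}

// Assumed thresholds: red hue wraps around 0 degrees, so accept both ends.
bool isRed(const Hsv& p) {
    const bool redHue = (p.h <= 15.0) || (p.h >= 345.0);
    return redHue && p.s >= 0.5 && p.v >= 0.3;
}

// Build a binary mask (255 = red, 0 = everything else) from interleaved RGB.
std::vector<std::uint8_t> redMask(const std::vector<std::uint8_t>& rgb) {
    assert(rgb.size() % 3 == 0);
    std::vector<std::uint8_t> mask(rgb.size() / 3, 0);
    for (std::size_t i = 0; i < mask.size(); ++i) {
        if (isRed(rgbToHsv(rgb[3 * i], rgb[3 * i + 1], rgb[3 * i + 2]))) {
            mask[i] = 255;
        }
    }
    return mask;
}

// 3x3 median filter on a single-channel image stored row by row.
// Border pixels are copied unchanged to keep the sketch short.
std::vector<std::uint8_t> median3x3(const std::vector<std::uint8_t>& img,
                                    int width, int height) {
    assert(static_cast<int>(img.size()) == width * height);
    std::vector<std::uint8_t> out(img);
    for (int y = 1; y + 1 < height; ++y) {
        for (int x = 1; x + 1 < width; ++x) {
            std::uint8_t window[9];
            int k = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    window[k++] = img[(y + dy) * width + (x + dx)];
            std::nth_element(window, window + 4, window + 9);
            out[y * width + x] = window[4];  // median of the 9 neighbors
        }
    }
    return out;
}
```

Calling `median3x3(redMask(pixels), width, height)` gives the cleaned binary image that an edge detector would then consume. An isolated white pixel is removed by the median, which is the kind of speckle this step is meant to suppress.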

Algorithm: The algorithm I chose for this task was taken from this (awesome) book on feature extraction and is called the Generalized Hough Transform (pretty different from the regular Hough Transform). It basically says a few things:

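  • You can describe an object in space without knowing its analytical equation (which is the case here).
  • It is robust to image deformations such as scaling and rotation, because it basically tests your image for every combination of scale factor and rotation factor.
  • It uses a base model (a template) that the algorithm will "learn".
  • Each pixel remaining in the contour image votes for another pixel that is supposedly the center (center of gravity) of the object, based on what was learned from the model.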

In the end, you get a heat map of the votes. Here, for example, all the pixels of the can's contour vote for its center of gravity, so you get a lot of votes in the single pixel corresponding to the center, and you see a peak in the heat map as below.
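The voting idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the author's implementation: it stores one displacement per template contour point and does not index the table by gradient direction, as a full Generalized Hough Transform would. The names `buildTable` and `castVotes` are hypothetical, and each call handles a single (scale, rotation) hypothesis.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Point { int x, y; };
struct Offset { double dx, dy; };

// "Learn" the template: for each contour point, remember the displacement
// from that point to the contour's center of gravity.
std::vector<Offset> buildTable(const std::vector<Point>& contour) {
    assert(!contour.empty());
    double cx = 0.0, cy = 0.0;
    for (const Point& p : contour) { cx += p.x; cy += p.y; }
    cx /= contour.size();
    cy /= contour.size();
    std::vector<Offset> table;
    table.reserve(contour.size());
    for (const Point& p : contour) table.push_back({cx - p.x, cy - p.y});
    return table;
}

// Cast votes for one (scale, rotation) hypothesis. Every edge pixel votes for
// every candidate center it could belong to. Returns a width*height heat map.
std::vector<int> castVotes(const std::vector<Point>& edges,
                           const std::vector<Offset>& table,
                           int width, int height,
                           double scale, double angleRad) {
    std::vector<int> acc(static_cast<std::size_t>(width) * height, 0);
    const double c = std::cos(angleRad), s = std::sin(angleRad);
    for (const Point& e : edges) {
        for (const Offset& o : table) {
            // Rotate and scale the learned displacement, then apply it.
            const double dx = scale * (c * o.dx - s * o.dy);
            const double dy = scale * (s * o.dx + c * o.dy);
            const int vx = static_cast<int>(std::lround(e.x + dx));
            const int vy = static_cast<int>(std::lround(e.y + dy));
            if (vx >= 0 && vx < width && vy >= 0 && vy < height) {
                ++acc[static_cast<std::size_t>(vy) * width + vx];
            }
        }
    }
    return acc;
}
```

If the image contains a translated copy of the template contour, all of its edge pixels agree on one center pixel for the matching hypothesis, which is what produces the peak in the heat map. Calling `castVotes` once per candidate scale and rotation gives one heat map per hypothesis.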

Once you have that, a simple threshold-based heuristic can give you the location of the center pixel, from which you can derive the scale and rotation and then plot your little rectangle around it (the final scale and rotation factors will obviously be relative to your original template). In theory at least...
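One way to read the threshold-based heuristic is: keep one heat map per (scale, rotation) hypothesis, take the strongest cell across all of them, and accept it only if it has enough votes. The sketch below assumes exactly that; `Hypothesis`, `Detection`, `bestDetection` and the `minVotes` parameter are hypothetical names, and the original post does not say how its threshold was chosen.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One heat map per tested (scale, rotation) combination.
struct Hypothesis {
    double scale;
    double angleDeg;
    std::vector<int> votes;  // width*height accumulator, row by row
};

struct Detection {
    bool found;       // false means "no can in this image"
    int x, y;         // center pixel of the strongest peak
    double scale;     // relative to the template
    double angleDeg;  // relative to the template
    int votes;
};

// Pick the strongest accumulator cell over all hypotheses, but only accept
// it if it reaches minVotes. The winning hypothesis supplies scale and angle.
Detection bestDetection(const std::vector<Hypothesis>& hyps,
                        int width, int minVotes) {
    assert(width > 0);
    Detection best{false, -1, -1, 0.0, 0.0, 0};
    for (const Hypothesis& h : hyps) {
        for (std::size_t i = 0; i < h.votes.size(); ++i) {
            if (h.votes[i] >= minVotes && h.votes[i] > best.votes) {
                best = Detection{true,
                                 static_cast<int>(i % width),
                                 static_cast<int>(i / width),
                                 h.scale, h.angleDeg, h.votes[i]};
            }
        }
    }
    return best;
}
```

When `found` is false, the caller can print the "no can found" message required by the constraints. Otherwise the rectangle is the template's bounding box, scaled and rotated by the returned factors and centered on `(x, y)`.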

Results: Now, while this approach worked in the basic cases, it was severely lacking in some areas:

  • It is extremely slow! I can't stress this enough. Almost a full day was needed to process the 30 test images, mostly because I searched a very large range of scale and rotation values, since some of the cans were very small.
  • It was completely lost when bottles were in the image, and for some reason it almost always found the bottle instead of the can (perhaps because bottles were bigger, thus had more pixels, thus more votes).
  • Fuzzy images were also no good, since the votes ended up in pixels at random locations around the center, resulting in a very noisy heat map.
  • Invariance to translation and rotation was achieved, but not to orientation, meaning that a can that was not directly facing the camera lens wasn't recognized.
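To see why an exhaustive search over scale and rotation gets slow, it helps to count the votes. The sketch below assumes the simplest voting scheme, in which every edge pixel votes once per template entry for each (scale, rotation) combination. The function name `voteCount` is hypothetical, and the figures in the usage note are made up for illustration, not taken from the author's runs.

```cpp
#include <cassert>
#include <cstdint>

// Number of individual votes cast for one image when every edge pixel votes
// once per template entry, for every (scale, rotation) combination tested.
std::uint64_t voteCount(std::uint64_t edgePixels,
                        std::uint64_t templateEntries,
                        std::uint64_t scaleSteps,
                        std::uint64_t rotationSteps) {
    assert(scaleSteps > 0 && rotationSteps > 0);
    return edgePixels * templateEntries * scaleSteps * rotationSteps;
}
```

For example, 20,000 edge pixels, a 500-point template, 40 scale steps and 72 rotation steps (5 degrees each) give 28,800,000,000 votes for a single image. Halving the number of steps along either axis halves the work, which is why the search ranges dominate the runtime.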

Can you help me improve my specific algorithm, using exclusively OpenCV features, to resolve the four specific issues mentioned?

I hope other people will learn something from it as well; after all, I don't think only the people who ask questions should learn :)

Recommended answer

An alternative approach would be to extract features (keypoints) using the scale-invariant feature transform (SIFT) or Speeded Up Robust Features (SURF).

It is implemented in OpenCV 2.3.1.

You can find a nice code example using features in "Features2D + Homography to find a known object" (http://opencv.itseez.com/doc/tutorials/features2d/feature_homography/feature_homography.html#feature-homography).

Both algorithms are invariant to scaling and rotation. Since they work with features, you can also handle occlusion (as long as enough keypoints are visible).
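The matching step behind this approach can be sketched without OpenCV as a brute-force nearest-neighbor search over descriptors. The sketch below adds a ratio test (accept a match only if the best candidate is clearly closer than the second best), which is a common filter that the answer itself does not mention, so treat it as an assumption. `Descriptor`, `Match`, `euclidean` and `ratioTestMatches` are hypothetical names. In OpenCV, the surviving point pairs would then typically be passed to `cv::findHomography` to locate the object.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Descriptor = std::vector<float>;  // e.g. 128 floats for SIFT

struct Match {
    std::size_t queryIdx;  // index into the template descriptors
    std::size_t trainIdx;  // index into the scene descriptors
    float distance;
};

float euclidean(const Descriptor& a, const Descriptor& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// For each query descriptor, find its two nearest train descriptors and keep
// the match only if the nearest is closer than maxRatio times the second.
std::vector<Match> ratioTestMatches(const std::vector<Descriptor>& query,
                                    const std::vector<Descriptor>& train,
                                    float maxRatio) {
    std::vector<Match> matches;
    if (train.size() < 2) return matches;  // ratio test needs two candidates
    for (std::size_t q = 0; q < query.size(); ++q) {
        float best = std::numeric_limits<float>::infinity();
        float second = std::numeric_limits<float>::infinity();
        std::size_t bestIdx = 0;
        for (std::size_t t = 0; t < train.size(); ++t) {
            const float d = euclidean(query[q], train[t]);
            if (d < best) {
                second = best;
                best = d;
                bestIdx = t;
            } else if (d < second) {
                second = d;
            }
        }
        if (best < maxRatio * second) matches.push_back({q, bestIdx, best});
    }
    return matches;
}
```

Because each keypoint is matched independently, a partly hidden can still yields matches for its visible keypoints, which is why this approach tolerates occlusion better than voting on a full contour.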

Image source: tutorial example

The processing takes a few hundred ms for SIFT; SURF is a bit faster, but it is not suitable for real-time applications. ORB uses FAST, which is weaker regarding rotation invariance.
