How to infer the state of a shape from colors
Problem Description
- I have Lego cubes forming a 4x4 shape, and I'm trying to infer the status of a zone inside the image: empty/full, and whether the color is yellow or blue.
- To simplify my work, I have added red markers to define the border of the shape, since the camera is shaking sometimes.
- Here is a clear image of the shape I'm trying to detect, taken by my phone camera
- The shape as seen from the side camera I'm supposed to use looks like this:
- To focus my work on the working area, I created a mask:
- So far I have been trying to locate the red markers by color (simple thresholding, without the HSV color space), like this:
(Note that this image is not my input image, it is only used to clearly demonstrate the required shape.)
(This is my input image)
import numpy as np
import matplotlib.pyplot as plt
import cv2

img = cv2.imread('sample.png')
RGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Restrict attention to the working area:
mask = cv2.imread('mask.png')
masked = np.minimum(RGB, mask)

# Keep only strongly red pixels: zero out anything with a noticeable
# green or blue component, then keep the red channel:
masked[masked[..., 1] > 25] = 0
masked[masked[..., 2] > 25] = 0
masked = masked[..., 0]

masked = cv2.medianBlur(masked, 5)
plt.imshow(masked, cmap='gray')
plt.show()
So far, I have spotted the markers:
But I'm still confused: how do I accurately detect the outer border of the desired zone, and the inner borders (each Lego cube's yellow/blue/green boundary) inside the red markers?

Thanks in advance for your kind advice.
Recommended Answer
I tested this approach using your undistorted image. Suppose you have the rectified camera image, so you see the Lego bricks through a "bird's eye" perspective. Now, the idea is to use the red markers to estimate a center rectangle and crop that portion of the image. Then, as you know each brick's dimensions (and they are constant), you can trace a grid and extract each cell of the grid. You can compute some HSV-based masks to estimate the dominant color in each cell, and that way you know if the space is occupied by a yellow or blue brick, or if it is empty.
These are the steps:
- Get an HSV mask of the red markers
- Use each marker's coordinates to estimate the center rectangle
- Crop the center rectangle
- Divide the rectangle into cells - this is the grid
- Run a series of HSV-based masks on each cell and compute the dominant color
- Label each cell with its dominant color
Let's see the code:
# Importing cv2 and numpy:
import numpy as np
import cv2
# image path
path = "D://opencvImages//"
fileName = "Bg9iB.jpg"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Store a deep copy for results:
inputCopy = inputImage.copy()
# Convert the image to HSV:
hsvImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2HSV)
# The HSV mask values (Red):
lowerValues = np.array([127, 0, 95])
upperValues = np.array([179, 255, 255])
# Create the HSV mask
mask = cv2.inRange(hsvImage, lowerValues, upperValues)
The first part is very straightforward. You set the HSV range and use cv2.inRange to get a binary mask of the target color. This is the result:
We can further improve the binary mask using some morphology. Let's apply a closing with a somewhat big structuring element and 10 iterations. We want those markers as clearly defined as possible:
# Set kernel (structuring element) size:
kernelSize = 5
# Set operation iterations:
opIterations = 10
# Get the structuring element:
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform closing:
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, maxKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
Which yields:
Very nice. Now, let's detect contours on this mask. We will approximate each contour to a bounding box and store its starting point and dimensions. The idea is that, while we will detect every contour, we are not sure of their order. We can sort this list later and get each bounding box from left to right, top to bottom, to better estimate the central rectangle. Let's detect contours:
# Create a deep copy, convert it to BGR for results:
maskCopy = mask.copy()
maskCopy = cv2.cvtColor(maskCopy, cv2.COLOR_GRAY2BGR)

# Find the big contours/blobs on the filtered image:
contours, hierarchy = cv2.findContours(mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)

# Bounding Rects are stored here:
boundRectsList = []

# Process each contour 1-1:
for i, c in enumerate(contours):
    # Approximate the contour to a polygon:
    contoursPoly = cv2.approxPolyDP(c, 3, True)
    # Convert the polygon to a bounding rectangle:
    boundRect = cv2.boundingRect(contoursPoly)
    # Get the bounding rect's data:
    rectX = boundRect[0]
    rectY = boundRect[1]
    rectWidth = boundRect[2]
    rectHeight = boundRect[3]
    # Estimate the bounding rect area:
    rectArea = rectWidth * rectHeight
    # Set a min area threshold
    minArea = 100
    # Filter blobs by area:
    if rectArea > minArea:
        # Store the rect:
        boundRectsList.append(boundRect)
I also created a deep copy of the mask image for further use, mainly to create this image, which is the result of the contour detection and bounding box approximation:
Notice that I have included a minimum area condition; I want to ignore noise below a certain threshold defined by minArea. Alright, now we have the bounding boxes in the boundRectsList variable. Let's sort these boxes using the Y coordinate:
# Sort the list based on ascending y values:
boundRectsSorted = sorted(boundRectsList, key=lambda x: x[1])
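One caveat (an addition to the original answer): sorting on the y value alone does not guarantee left-to-right order within a row, because the two markers of the same row rarely share exactly the same y coordinate. A more robust sketch quantizes y into a row index first, then orders by x inside each row; rowHeight is an assumed value that just needs to be larger than the vertical jitter between markers of one row. Illustrated here with sample rects:

```python
def sort_grid(rects, rowHeight):
    # rects are (x, y, w, h) tuples; quantize y into a row index,
    # then order left-to-right inside each row:
    return sorted(rects, key=lambda r: (r[1] // rowHeight, r[0]))

# Example: four markers with slight y-jitter within each row:
markers = [(520, 12, 40, 40), (15, 20, 40, 40),
           (510, 410, 40, 40), (10, 405, 40, 40)]
ordered = sort_grid(markers, rowHeight=100)
```

With four well-separated markers the plain y-sort in the answer works fine; this variant only matters if the jitter ever exceeds the gap between rows.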
The list is now sorted and we can enumerate the boxes from left to right, top to bottom, like this: first "row" -> 0, 1; second "row" -> 2, 3. Now, we can define the big central rectangle using this info. I call these "inner points". Notice that the rectangle is defined as a function of all the bounding boxes. For example, its top-left starting point is defined by bounding box 0's bottom-right ending point (both x and y). Its width is defined by bounding box 1's bottom-left x coordinate, and its height by bounding box 2's rightmost y coordinate. I'm gonna loop through each bounding box and extract their relevant dimensions to construct the center rectangle in the following way: (top left x, top left y, width, height). There's more than one way to achieve this. I prefer to use a dictionary to get the relevant data. Let's see:
# Rectangle dictionary:
# Each entry is an index of the currentRect list
# 0 - X, 1 - Y, 2 - Width, 3 - Height
# Additionally: -1 is 0 (no dimension):
pointsDictionary = {0: (2, 3),
                    1: (-1, 3),
                    2: (2, -1),
                    3: (-1, -1)}

# Store center rectangle coordinates here:
centerRectangle = [None]*4

# Process the sorted rects:
rectCounter = 0

for i in range(len(boundRectsSorted)):
    # Get sorted rect:
    currentRect = boundRectsSorted[i]

    # Get the bounding rect's data:
    rectX = currentRect[0]
    rectY = currentRect[1]
    rectWidth = currentRect[2]
    rectHeight = currentRect[3]

    # Draw sorted rect:
    cv2.rectangle(maskCopy, (int(rectX), int(rectY)), (int(rectX + rectWidth),
                  int(rectY + rectHeight)), (0, 255, 0), 5)

    # Get the inner points:
    currentInnerPoint = pointsDictionary[i]
    borderPoint = [None]*2

    # Check coordinates:
    for p in range(2):
        # Check for '0' index:
        idx = currentInnerPoint[p]
        if idx == -1:
            borderPoint[p] = 0
        else:
            borderPoint[p] = currentRect[idx]

    # Draw the border points:
    color = (0, 0, 255)
    thickness = -1
    centerX = rectX + borderPoint[0]
    centerY = rectY + borderPoint[1]
    radius = 50
    cv2.circle(maskCopy, (centerX, centerY), radius, color, thickness)

    # Mark the circle:
    org = (centerX - 20, centerY + 20)
    font = cv2.FONT_HERSHEY_SIMPLEX
    cv2.putText(maskCopy, str(rectCounter), org, font,
                2, (0, 0, 0), 5, cv2.LINE_8)

    # Show the circle:
    cv2.imshow("Sorted Rects", maskCopy)
    cv2.waitKey(0)

    # Store the coordinates into the list:
    if rectCounter == 0:
        centerRectangle[0] = centerX
        centerRectangle[1] = centerY
    elif rectCounter == 1:
        centerRectangle[2] = centerX - centerRectangle[0]
    elif rectCounter == 2:
        centerRectangle[3] = centerY - centerRectangle[1]

    # Increase rectCounter:
    rectCounter += 1
This image shows each inner point with a red circle. Each circle is enumerated from left to right, top to bottom. The inner points are stored in the centerRectangle list:
If you join each inner point, you get the center rectangle we have been looking for:
# Check out the big rectangle at the center:
bigRectX = centerRectangle[0]
bigRectY = centerRectangle[1]
bigRectWidth = centerRectangle[2]
bigRectHeight = centerRectangle[3]
# Draw the big rectangle:
cv2.rectangle(maskCopy, (int(bigRectX), int(bigRectY)), (int(bigRectX + bigRectWidth),
int(bigRectY + bigRectHeight)), (0, 0, 255), 5)
cv2.imshow("Big Rectangle", maskCopy)
cv2.waitKey(0)
Check it out:
Now, just crop this portion of the original image:
# Crop the center portion:
centerPortion = inputCopy[bigRectY:bigRectY + bigRectHeight, bigRectX:bigRectX + bigRectWidth]
# Store a deep copy for results:
centerPortionCopy = centerPortion.copy()
This is the central portion of the image:
Cool, now let's create the grid. You know that there must be 4 bricks per width and 4 bricks per height. We can divide the image using this info. I'm storing each sub-image, or cell, in a list. I'm also estimating each cell's center for additional processing; these are stored in a list too. Let's see the procedure:
# Divide the image into a grid:
verticalCells = 4
horizontalCells = 4

# Cell dimensions:
cellWidth = bigRectWidth / verticalCells
cellHeight = bigRectHeight / horizontalCells

# Store the cells here:
cellList = []

# Store cell centers here:
cellCenters = []

# Loop thru vertical dimension:
for j in range(verticalCells):
    # Cell starting y position:
    yo = j * cellHeight
    # Loop thru horizontal dimension:
    for i in range(horizontalCells):
        # Cell starting x position:
        xo = i * cellWidth
        # Cell dimensions:
        cX = int(xo)
        cY = int(yo)
        cWidth = int(cellWidth)
        cHeight = int(cellHeight)
        # Crop current cell:
        currentCell = centerPortion[cY:cY + cHeight, cX:cX + cWidth]
        # Into the cell list:
        cellList.append(currentCell)
        # Store cell center:
        cellCenters.append((cX + 0.5 * cWidth, cY + 0.5 * cHeight))
        # Draw cell:
        cv2.rectangle(centerPortionCopy, (cX, cY), (cX + cWidth, cY + cHeight), (255, 255, 0), 5)

cv2.imshow("Grid", centerPortionCopy)
cv2.waitKey(0)
This is the grid:
Let's now process each cell individually. Of course, you could process each cell inside the last loop, but I'm not currently looking for optimization; clarity is my priority. We need to generate a series of HSV masks with the target colors: yellow, blue, and green (empty). Again, I prefer to implement a dictionary with the target colors. I'll generate a mask for each color and count the number of white pixels using cv2.countNonZero. Again, I set a minimum threshold, this time of 10. With this info I can determine which mask generated the maximum number of white pixels, thus giving me the dominant color:
# HSV dictionary - color ranges and color name:
colorDictionary = {0: ([93, 64, 21], [121, 255, 255], "blue"),
                   1: ([20, 64, 21], [30, 255, 255], "yellow"),
                   2: ([55, 64, 21], [92, 255, 255], "green")}

# Cell counter:
cellCounter = 0

for c in range(len(cellList)):
    # Get current cell:
    currentCell = cellList[c]
    # Convert to HSV:
    hsvCell = cv2.cvtColor(currentCell, cv2.COLOR_BGR2HSV)
    # Some additional info:
    (h, w) = currentCell.shape[:2]

    # Process masks:
    maxCount = 10
    cellColor = "None"

    for m in range(len(colorDictionary)):
        # Get current lower and upper range values:
        currentLowRange = np.array(colorDictionary[m][0])
        currentUppRange = np.array(colorDictionary[m][1])
        # Create the HSV mask:
        mask = cv2.inRange(hsvCell, currentLowRange, currentUppRange)
        # Get the number of target pixels:
        targetPixelCount = cv2.countNonZero(mask)
        if targetPixelCount > maxCount:
            maxCount = targetPixelCount
            # Get color name from dictionary:
            cellColor = colorDictionary[m][2]

    # Get cell center, add an x offset:
    textX = int(cellCenters[cellCounter][0]) - 100
    textY = int(cellCenters[cellCounter][1])
    # Draw text on cell's center:
    font = cv2.FONT_HERSHEY_SIMPLEX
    cv2.putText(centerPortion, cellColor, (textX, textY), font,
                2, (0, 0, 255), 5, cv2.LINE_8)
    # Increase cellCounter:
    cellCounter += 1

cv2.imshow("centerPortion", centerPortion)
cv2.waitKey(0)
This is the result:
From here it is easy to identify the empty spaces on the grid. What I didn't cover was the perspective rectification of your distorted image, but there's plenty of info on how to do that. Hope this helps you out!
If you want to apply this approach to your distorted image, you need to undo the fish-eye and the perspective distortion. Your rectified image should look like this:
You will probably have to tweak some values, because some of the distortion still remains even after rectification.