How to infer the state of a shape from colors


Question

I have Lego cubes forming a 4x4 shape, and I'm trying to infer the status of a zone inside the image: empty/full, and the color, whether yellow or blue.

To simplify my work I have added red markers to define the borders of the shape, since the camera is shaking sometimes. Here is a clear image of the shape I'm trying to detect, taken by my phone camera (note that this image is not my input image; it is only used to clearly demonstrate the desired shape).

The shape as seen from the side camera I'm supposed to use looks like this (this is my input image):

To focus my work on the working zone, I created a mask:

So far I have been trying to locate the red markers by color (simple thresholding, without the HSV color space), like this:
import numpy as np
import matplotlib.pyplot as plt
import cv2

# Read the input and convert BGR -> RGB:
img = cv2.imread('sample.png')
RGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Restrict processing to the working zone:
mask = cv2.imread('mask.png')
masked = np.minimum(RGB, mask)

# Keep only strongly red pixels (low green and low blue), then take the R channel:
masked[masked[..., 1] > 25] = 0
masked[masked[..., 2] > 25] = 0
masked = masked[..., 0]

# Remove small specks:
masked = cv2.medianBlur(masked, 5)

plt.imshow(masked, cmap='gray')
plt.show()
      

And I have spotted the markers so far:

But I'm still confused: how do I accurately detect the outer border of the required zone, and the inner borders (the boundary of each Lego cube: yellow - blue - green) inside the red markers?

Thanks in advance for your kind advice.

Answer

I tested this approach using your undistorted image. Suppose you have the rectified camera image, so you see the Lego bricks from a "bird's eye" perspective. Now, the idea is to use the red markers to estimate a center rectangle and crop that portion of the image. Then, as you know each brick's dimensions (and they are constant), you can trace a grid and extract each cell of the grid. You can compute some HSV-based masks to estimate the dominant color in each cell, and that way you know whether the space is occupied by a yellow brick, a blue brick, or is empty.

These are the steps:

1. Get an HSV mask of the red markers
2. Use each marker to estimate the center rectangle through each marker's coordinates
3. Crop the center rectangle
4. Divide the rectangle into cells - this is the grid
5. Run a series of HSV-based masks on each cell and compute the dominant color
6. Label each cell with the dominant color

Let's see the code:

      # Importing cv2 and numpy:
      import numpy as np
      import cv2
      
      # image path
      path = "D://opencvImages//"
      fileName = "Bg9iB.jpg"
      
      # Reading an image in default mode:
      inputImage = cv2.imread(path + fileName)
      # Store a deep copy for results:
      inputCopy = inputImage.copy()
      
      # Convert the image to HSV:
      hsvImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2HSV)
      
      # The HSV mask values (Red):
      lowerValues = np.array([127, 0, 95])
      upperValues = np.array([179, 255, 255])
      
      # Create the HSV mask
      mask = cv2.inRange(hsvImage, lowerValues, upperValues)
      

The first part is very straightforward. You set the HSV range and use cv2.inRange to get a binary mask of the target color. This is the result:

We can further improve the binary mask using some morphology. Let's apply a closing with a somewhat big structuring element and 10 iterations. We want those markers as clearly defined as possible:

      # Set kernel (structuring element) size:
      kernelSize = 5
      # Set operation iterations:
      opIterations = 10
      # Get the structuring element:
      maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
      # Perform closing:
      mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, maxKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
      

Which yields:

Very nice. Now, let's detect contours on this mask. We will approximate each contour to a bounding box and store its starting point and dimensions. The idea being that, while we will detect every contour, we are not sure of their order. We can sort this list later and get each bounding box from left to right, top to bottom to better estimate the central rectangle. Let's detect contours:

      # Create a deep copy, convert it to BGR for results:
      maskCopy = mask.copy()
      maskCopy = cv2.cvtColor(maskCopy, cv2.COLOR_GRAY2BGR)
      
      # Find the big contours/blobs on the filtered image:
      contours, hierarchy = cv2.findContours(mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
      
      # Bounding Rects are stored here:
      boundRectsList = []
      
      # Process each contour 1-1:
      for i, c in enumerate(contours):
      
          # Approximate the contour to a polygon:
          contoursPoly = cv2.approxPolyDP(c, 3, True)
      
          # Convert the polygon to a bounding rectangle:
          boundRect = cv2.boundingRect(contoursPoly)
      
          # Get the bounding rect's data:
          rectX = boundRect[0]
          rectY = boundRect[1]
          rectWidth = boundRect[2]
          rectHeight = boundRect[3]
      
          # Estimate the bounding rect area:
          rectArea = rectWidth * rectHeight
      
          # Set a min area threshold
          minArea = 100
      
          # Filter blobs by area:
          if rectArea > minArea:
              #Store the rect:
              boundRectsList.append(boundRect)
      

I also created a deep copy of the mask image for further use. Mainly to create this image, which is the result of the contour detection and bounding box approximation:

Notice that I have included a minimum area condition. I want to ignore noise below a certain threshold defined by minArea. Alright, now we have the bounding boxes in the boundRectsList variable. Let's sort these boxes using the Y coordinate:

      # Sort the list based on ascending y values:
      boundRectsSorted = sorted(boundRectsList, key=lambda x: x[1])
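Note that sorting by y alone assumes markers in the same row share similar y values; with a tilted camera, boxes from different rows can interleave. A hedged alternative (an assumption on my part, not part of the original answer) is a row-major sort: group boxes into rows by a y tolerance, then sort each row by x. The row_tol value is illustrative:

```python
def sort_row_major(rects, row_tol=50):
    """Sort bounding rects (x, y, w, h) top-to-bottom, then left-to-right.
    row_tol is an assumed pixel tolerance for grouping boxes into rows."""
    rects = sorted(rects, key=lambda r: r[1])  # coarse sort by y
    rows, current = [], [rects[0]]
    for r in rects[1:]:
        # Same row if y is within tolerance of the row's first box:
        if abs(r[1] - current[0][1]) <= row_tol:
            current.append(r)
        else:
            rows.append(current)
            current = [r]
    rows.append(current)
    # Sort each row by x and flatten:
    return [r for row in rows for r in sorted(row, key=lambda r: r[0])]

# Example: markers detected out of order
boxes = [(300, 12, 40, 40), (10, 10, 40, 40), (305, 200, 40, 40), (8, 198, 40, 40)]
print(sort_row_major(boxes))
```

With four well-separated markers the simple y sort usually suffices, but the row-major version is more robust to small rotations.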
      

The list is now sorted and we can enumerate the boxes from left to right, top to bottom, like this: first "row" -> 0, 1; second "row" -> 2, 3. Now, we can define the big central rectangle using this info. I call these "inner points". Notice the rectangle is defined as a function of all the bounding boxes. For example, its top left starting point is defined by bounding box 0's bottom right ending point (both x and y). Its width is defined by bounding box 1's bottom left x coordinate, and its height is defined by bounding box 2's bottom y coordinate. I'm going to loop through each bounding box and extract their relevant dimensions to construct the center rectangle in the following way: (top left x, top left y, width, height). There's more than one way to achieve this. I prefer to use a dictionary to get the relevant data. Let's see:

      # Rectangle dictionary:
      # Each entry is an index of the currentRect list
      # 0 - X, 1 - Y, 2 - Width, 3 - Height
      # Additionally: -1 is 0 (no dimension):
      pointsDictionary = {0: (2, 3),
                          1: (-1, 3),
                          2: (2, -1),
                          3: (-1, -1)}
      
      # Store center rectangle coordinates here:
      centerRectangle = [None]*4
      
      # Process the sorted rects:
      rectCounter = 0
      
      for i in range(len(boundRectsSorted)):
      
          # Get sorted rect:
          currentRect = boundRectsSorted[i]
      
          # Get the bounding rect's data:
          rectX = currentRect[0]
          rectY = currentRect[1]
          rectWidth = currentRect[2]
          rectHeight = currentRect[3]
      
          # Draw sorted rect:
          cv2.rectangle(maskCopy, (int(rectX), int(rectY)), (int(rectX + rectWidth),
                                   int(rectY + rectHeight)), (0, 255, 0), 5)
      
          # Get the inner points:
          currentInnerPoint = pointsDictionary[i]
          borderPoint = [None]*2
      
          # Check coordinates:
          for p in range(2):
              # Check for '0' index:
              idx = currentInnerPoint[p]
              if idx == -1:
                  borderPoint[p] = 0
              else:
                  borderPoint[p] = currentRect[idx]
      
          # Draw the border points:
          color = (0, 0, 255)
          thickness = -1
          centerX = rectX + borderPoint[0]
          centerY = rectY + borderPoint[1]
          radius = 50
          cv2.circle(maskCopy, (centerX, centerY), radius, color, thickness)
      
          # Mark the circle
          org = (centerX - 20, centerY + 20)
          font = cv2.FONT_HERSHEY_SIMPLEX
          cv2.putText(maskCopy, str(rectCounter), org, font,
                  2, (0, 0, 0), 5, cv2.LINE_8)
      
          # Show the circle:
          cv2.imshow("Sorted Rects", maskCopy)
          cv2.waitKey(0)
      
          # Store the coordinates into list
          if rectCounter == 0:
              centerRectangle[0] = centerX
              centerRectangle[1] = centerY
          else:
              if rectCounter == 1:
                  centerRectangle[2] = centerX - centerRectangle[0]
              else:
                  if rectCounter == 2:
                      centerRectangle[3] = centerY - centerRectangle[1]
          # Increase rectCounter:
          rectCounter += 1
      

This image shows each inner point with a red circle. Each circle is enumerated from left to right, top to bottom. The inner points are stored in the centerRectangle list:

If you join each inner point you get the center rectangle we have been looking for:

      # Check out the big rectangle at the center:
      bigRectX = centerRectangle[0]
      bigRectY = centerRectangle[1]
      bigRectWidth = centerRectangle[2]
      bigRectHeight = centerRectangle[3]
      # Draw the big rectangle:
      cv2.rectangle(maskCopy, (int(bigRectX), int(bigRectY)), (int(bigRectX + bigRectWidth),
                           int(bigRectY + bigRectHeight)), (0, 0, 255), 5)
      cv2.imshow("Big Rectangle", maskCopy)
      cv2.waitKey(0)
      

Check it out:

Now, just crop this portion of the original image:

      # Crop the center portion:
      centerPortion = inputCopy[bigRectY:bigRectY + bigRectHeight, bigRectX:bigRectX + bigRectWidth]
      
      # Store a deep copy for results:
      centerPortionCopy = centerPortion.copy()
      

This is the central portion of the image:

Cool, now let's create the grid. You know that there must be 4 bricks along the width and 4 bricks along the height. We can divide the image using this info. I'm storing each sub-image, or cell, in a list. I'm also estimating each cell's center, for additional processing. These are stored in a list too. Let's see the procedure:

# Divide the image into a grid:
      verticalCells = 4
      horizontalCells = 4
      
      # Cell dimensions
      cellWidth = bigRectWidth / verticalCells
      cellHeight = bigRectHeight / horizontalCells
      
      # Store the cells here:
      cellList = []
      
      # Store cell centers here:
      cellCenters = []
      
      # Loop thru vertical dimension:
      for j in range(verticalCells):
      
          # Cell starting y position:
          yo = j * cellHeight
      
          # Loop thru horizontal dimension:
          for i in range(horizontalCells):
      
              # Cell starting x position:
              xo = i * cellWidth
      
              # Cell Dimensions:
              cX = int(xo)
              cY = int(yo)
              cWidth = int(cellWidth)
              cHeight = int(cellHeight)
      
              # Crop current cell:
              currentCell = centerPortion[cY:cY + cHeight, cX:cX + cWidth]
      
              # into the cell list:
              cellList.append(currentCell)
      
              # Store cell center:
              cellCenters.append((cX + 0.5 * cWidth, cY + 0.5 * cHeight))
      
              # Draw Cell
              cv2.rectangle(centerPortionCopy, (cX, cY), (cX + cWidth, cY + cHeight), (255, 255, 0), 5)
      
          cv2.imshow("Grid", centerPortionCopy)
          cv2.waitKey(0)
      

This is the grid:

Let's now process each cell individually. Of course, you could process each cell in the last loop, but I'm not currently looking for optimization; clarity is my priority. We need to generate a series of HSV masks with the target colors: yellow, blue and green (empty). Again, I prefer to implement a dictionary with the target colors. I'll generate a mask for each color and count the number of white pixels using cv2.countNonZero. Again, I set a minimum threshold, this time of 10. With this info I can determine which mask generated the maximum number of white pixels, thus giving me the dominant color:

      # HSV dictionary - color ranges and color name:
      colorDictionary = {0: ([93, 64, 21], [121, 255, 255], "blue"),
                         1: ([20, 64, 21], [30, 255, 255], "yellow"),
                         2: ([55, 64, 21], [92, 255, 255], "green")}
      
      # Cell counter:
      cellCounter = 0
      
      for c in range(len(cellList)):
      
          # Get current Cell:
          currentCell = cellList[c]
          # Convert to HSV:
          hsvCell = cv2.cvtColor(currentCell, cv2.COLOR_BGR2HSV)
      
          # Some additional info:
          (h, w) = currentCell.shape[:2]
      
          # Process masks:
          maxCount = 10
          cellColor = "None"
      
          for m in range(len(colorDictionary)):
      
              # Get current lower and upper range values:
              currentLowRange = np.array(colorDictionary[m][0])
              currentUppRange = np.array(colorDictionary[m][1])
      
              # Create the HSV mask
              mask = cv2.inRange(hsvCell, currentLowRange, currentUppRange)
      
              # Get max number of target pixels
              targetPixelCount = cv2.countNonZero(mask)
              if targetPixelCount > maxCount:
                  maxCount = targetPixelCount
                  # Get color name from dictionary:
                  cellColor = colorDictionary[m][2]
      
          # Get cell center, add an x offset:
          textX = int(cellCenters[cellCounter][0]) - 100
          textY = int(cellCenters[cellCounter][1])
      
          # Draw text on cell's center:
          font = cv2.FONT_HERSHEY_SIMPLEX
          cv2.putText(centerPortion, cellColor, (textX, textY), font,
                          2, (0, 0, 255), 5, cv2.LINE_8)
      
          # Increase cellCounter:
          cellCounter += 1
      
          cv2.imshow("centerPortion", centerPortion)
          cv2.waitKey(0)
      

This is the result:

From here it is easy to identify the empty spaces on the grid. What I didn't cover was the perspective rectification of your distorted image, but there's plenty of info on how to do that. Hope this helps you out!

If you want to apply this approach to your distorted image you need to undo the fish-eye and the perspective distortion. Your rectified image should look like this:

You probably will have to tweak some values because some of the distortion still remains, even after rectification.
