Extracting polygon given coordinates from an image using OpenCV

Problem description

I have the following points:

     <data:polygon>
                            <data:point x="542" y="107"/>
                            <data:point x="562" y="102"/>
                            <data:point x="582" y="110"/>
                            <data:point x="598" y="142"/>
                            <data:point x="600" y="192"/>
                            <data:point x="601" y="225"/>
                            <data:point x="592" y="261"/>
                            <data:point x="572" y="263"/>
                            <data:point x="551" y="245"/>
                            <data:point x="526" y="220"/>
                            <data:point x="520" y="188"/>
                            <data:point x="518" y="152"/>
                            <data:point x="525" y="127"/>
                            <data:point x="542" y="107"/>
 </data:polygon>

I want to draw the polygon defined by these points in the image and then extract it. How can I do that using OpenCV with Python?

Recommended answer

Use cv2.fillConvexPoly: you pass it a 2D array of points and it fills the shape those points define, which lets you build a mask that is non-zero inside the polygon and zero everywhere else. A fair warning: as the name suggests, fillConvexPoly assumes the points define a convex polygon. For a non-convex shape, use cv2.fillPoly instead.

We can then convert this to a Boolean mask and use it to index into your image to extract the pixels you want. The code below produces an array called mask, which is a Boolean mask of the pixels you want to keep from the image. In addition, the array out will contain the extracted subimage defined by the polygon. Note that out is initialized to be completely black and that the only pixels copied over are those inside the polygon.

Assuming the actual image is called img, and assuming that your x and y points denote the horizontal and vertical coordinates in the image, you can do something like this:

import numpy as np
import cv2

# Polygon vertices as (x, y); the last point repeats the first, as in the question
pts = np.array([[542, 107], [562, 102], [582, 110], [598, 142], [600, 192], [601, 225], [592, 261], [572, 263], [551, 245], [526, 220], [520, 188], [518, 152], [525, 127], [542, 107]], dtype=np.int32)

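# Single-channel 8-bit mask with the same height and width as the image
mask = np.zeros((img.shape[0], img.shape[1]), dtype=np.uint8)

# Fill the polygon with 1s, then convert to a Boolean mask
# (np.bool was removed in NumPy 1.24; the built-in bool works everywhere)
cv2.fillConvexPoly(mask, pts, 1)
mask = mask.astype(bool)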

out = np.zeros_like(img)
out[mask] = img[mask]

out should be all black except for the region that was copied over. If you want to display this image, you can do something like:

cv2.imshow('Extracted Image', out)
cv2.waitKey(0)
cv2.destroyAllWindows()

This displays the extracted image and waits for a key press. When you are finished looking at the image, press any key while the display window has focus.

If you want to save this image to a file, do something like this:

cv2.imwrite('output.png', out)

This saves the image to a file called output.png. I specify the PNG format because it's lossless.

As a simple test, let's define a white image with 300 rows and 700 columns, which is comfortably larger than the largest coordinates you have defined. Let's extract the region defined by that polygon and see what the output looks like.

img = 255*np.ones((300, 700, 3), dtype=np.uint8)

With the test image above, out is black everywhere except for a white region in the shape of the polygon.

If you would like to translate the extracted image so that it's in the middle, and then draw a rectangle around its bounding box, a trick that I can suggest is to use cv2.remap to translate the image. Once you're done, use cv2.rectangle to draw the rectangle.

The way cv2.remap works is that for each pixel in the output, you specify the spatial coordinate of the pixel to read from the source image. Because you're ultimately moving the output to the centre of the image, you need to apply an offset to every x and y location in the destination image to get the source pixel.

To figure out the right offsets to move the image, find the centroid of the polygon, translate the polygon so that the centroid is at the origin, and then translate it again so that it's at the centre of the image.

Using the variables we defined above, you can find the centroid (here, the mean of the polygon's vertices) by:

(meanx, meany) = pts.mean(axis=0)

Once you have the centroid, you take all of the points and subtract the centroid, then add the appropriate coordinates to move them to the centre of the image. The centre of the image can be found by:

(cenx, ceny) = (img.shape[1]/2, img.shape[0]/2)

It's also important to convert these coordinates to integers, since pixel coordinates are integers:

(meanx, meany, cenx, ceny) = np.floor([meanx, meany, cenx, ceny]).astype(np.int32)

Now compute the offset as described above:

(offsetx, offsety) = (-meanx + cenx, -meany + ceny)
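As a worked example, the offset steps above can be run on their own with only NumPy. This sketch assumes the vertex list from the question and the 300 x 700 test image size; height and width are written out directly instead of being read from img.

```python
import numpy as np

# Polygon vertices from the question, as (x, y)
pts = np.array([[542, 107], [562, 102], [582, 110], [598, 142], [600, 192],
                [601, 225], [592, 261], [572, 263], [551, 245], [526, 220],
                [520, 188], [518, 152], [525, 127], [542, 107]], dtype=np.int32)

# Size of the test image (rows, columns)
height, width = 300, 700

# Mean of the vertices and centre of the image
meanx, meany = pts.mean(axis=0)
cenx, ceny = width / 2, height / 2

# Round everything down to integer pixel coordinates
meanx, meany, cenx, ceny = np.floor([meanx, meany, cenx, ceny]).astype(np.int32)

# Offset that moves the vertex mean onto the image centre
offsetx, offsety = cenx - meanx, ceny - meany
```

For these points the vertex mean floors to (559, 174) and the image centre is (350, 150), so the offset is (-209, -24): the polygon moves 209 pixels to the left and 24 pixels up.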

Now, translate your image. You need to define a mapping for each pixel in the output image: for each point (x,y) in the destination image, you provide the location to sample from in the source. The offset that we calculated translates each source pixel to its destination location. Because we're doing the opposite, finding for each destination pixel which source pixel to sample from, we must subtract the offset, not add it. Therefore, first define a grid of (x,y) points as usual, then subtract the offset. Once you're done, translate the image:

(mx, my) = np.meshgrid(np.arange(img.shape[1]), np.arange(img.shape[0]))
ox = (mx - offsetx).astype(np.float32)
oy = (my - offsety).astype(np.float32)
out_translate = cv2.remap(out, ox, oy, cv2.INTER_LINEAR)

If we display out_translate for the above example, the extracted polygon appears at the centre of the image.

Cool! Now it's time to draw the rectangle on top of this image. All you have to do is figure out the top left and bottom right corners of the rectangle. This can be done by taking the top left and bottom right corners of the polygon's bounding box and adding the offset to move these points to the centre of the image:

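topleft = pts.min(axis=0) + [offsetx, offsety]
bottomright = pts.max(axis=0) + [offsetx, offsety]
# Convert to plain Python ints; some OpenCV versions reject NumPy integer scalars as points
cv2.rectangle(out_translate, tuple(int(v) for v in topleft), tuple(int(v) for v in bottomright), color=(255,0,0))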

If we show this image, the centred polygon now has a rectangle drawn around it.

The cv2.rectangle call above draws a blue rectangle around the centred image (OpenCV orders colour channels as BGR, so (255,0,0) is blue). The full code, from the start (extracting the pixel region) to the end (translating and drawing a rectangle), is:

# Import relevant modules
import numpy as np
import cv2

# Define points
pts = np.array([[542, 107], [562, 102], [582, 110], [598, 142], [600, 192], [601, 225], [592, 261], [572, 263], [551, 245], [526, 220], [520, 188], [518, 152], [525, 127], [542, 107]], dtype=np.int32)

### Define image here
img = 255*np.ones((300, 700, 3), dtype=np.uint8)

# Initialize mask (single-channel, 8-bit)
mask = np.zeros((img.shape[0], img.shape[1]), dtype=np.uint8)

# Create mask that defines the polygon of points
cv2.fillConvexPoly(mask, pts, 1)
mask = mask.astype(bool)  # np.bool was removed in NumPy 1.24

# Create output image (untranslated)
out = np.zeros_like(img)
out[mask] = img[mask]

# Find centroid of polygon
(meanx, meany) = pts.mean(axis=0)

# Find centre of image
(cenx, ceny) = (img.shape[1]/2, img.shape[0]/2)

# Make integer coordinates for each of the above
(meanx, meany, cenx, ceny) = np.floor([meanx, meany, cenx, ceny]).astype(np.int32)

# Calculate final offset to translate source pixels to centre of image
(offsetx, offsety) = (-meanx + cenx, -meany + ceny)

# Define remapping coordinates
(mx, my) = np.meshgrid(np.arange(img.shape[1]), np.arange(img.shape[0]))
ox = (mx - offsetx).astype(np.float32)
oy = (my - offsety).astype(np.float32)

# Translate the image to centre
out_translate = cv2.remap(out, ox, oy, cv2.INTER_LINEAR)

# Determine top left and bottom right of translated image
topleft = pts.min(axis=0) + [offsetx, offsety]
bottomright = pts.max(axis=0) + [offsetx, offsety]

# Draw rectangle (corners converted to plain Python ints, which all OpenCV versions accept)
cv2.rectangle(out_translate, tuple(int(v) for v in topleft), tuple(int(v) for v in bottomright), color=(255,0,0))

# Show image, wait for user input, then save the image
cv2.imshow('Output Image', out_translate)
cv2.waitKey(0)
cv2.destroyAllWindows()
cv2.imwrite('output.png', out_translate)
