How to construct horizontal projection of binary image in OpenCV


Problem description

I am doing a text segmentation project for school. I need to compute the horizontal projection of a binary image. The result I want looks like the Quora example.

I am using OpenCV in Python. I used x_sum = cv2.reduce(img, 0, cv2.REDUCE_SUM, dtype=cv2.CV_32S) to get the array of sums, as advised by two earlier questions: "horizontal and vertical projection of an image" and "Horizontal Histogram in OpenCV".

I tried to get the horizontal projection image by using cv2.calcHist, but what I got was just a single horizontal line. My code is below:

image = cv2.imread(file_name)
x_sum = cv2.reduce(image, 0, cv2.REDUCE_SUM, dtype=cv2.CV_32S)
horizontal_projection=cv2.calcHist(x_sum,[0],None,[256],[0,256])
cv2.imwrite("image2.png", horizontal_projection) 

Please help and tell me what I am doing wrong. I need my horizontal projection results to be just like the Quora example.

Recommended answer

When calculating the projection, you basically want to sum the pixels along each row of the image. However, your text is black, which is encoded as zero, so rows with a lot of text give small sums and rows with little text give large sums. That is the opposite of what you want, so you need to invert the image first:

import cv2
import numpy as np

# Load as greyscale
im = cv2.imread('text.png', cv2.IMREAD_GRAYSCALE)

# Invert
im = 255 - im

# Calculate horizontal projection
proj = np.sum(im,1)

The array proj now has 141 entries, one per row of the image, each indicating how much text is in that row:

array([    0,     0,     0,     0,    40,    44,   144,   182,   264,
         326,   425,  1193,  2718,  5396,  9272, 11880, 13266, 13597,
       12906, 11962, 10791,  9647,  8554, 20469, 45426, 65714, 81397,
       81675, 66590, 58714, 58046, 60516, 66136, 71794, 77552, 78555,
       74868, 72083, 70139, 70160, 72174, 76409, 82854, 88962, 94721,
       88105, 69126, 47753, 23966, 13845, 17406, 19145, 19079, 16548,
       11524,  8511,  7465,  7042,  7197,  6577,  5022,  3476,  1797,
         809,   450,   309,   348,   351,   250,   232,   271,   279,
         251,   628,  1419,  3259,  6187,  8272,  9551,  9825,  9119,
        7984,  6444,  5305,  4596, 13385, 31647, 46330, 57459, 56139,
       42402, 34928, 33729, 35055, 38874, 41649, 43394, 43265, 41291,
       40126, 39767, 40515, 42390, 44478, 46793, 47881, 47743, 43983,
       36644, 28054, 18242, 15583, 20047, 22038, 21569, 17751, 10571,
        6830,  6580,  6231,  5681,  4595,  2879,  1642,   771,   365,
         320,   282,   105,    88,    76,    76,    28,    28,    28,
          28,     0,     0,     0,     0,     0], dtype=uint64)

I cropped your image to 819x141 pixels before running this, which is why proj has 141 entries.

There are many ways to do the visualisation. Here is one:

#!/usr/bin/env python3

import cv2
import numpy as np

# Load as greyscale
im = cv2.imread('text.png', cv2.IMREAD_GRAYSCALE)

# Invert
im = 255 - im

# Calculate horizontal projection
proj = np.sum(im,1)

# Create a black 8-bit output image: same height as the text, w px wide
m = np.max(proj)
w = 500
result = np.zeros((proj.shape[0], w), dtype=np.uint8)

# Draw one horizontal bar per row, scaled so the fullest row spans the width
for row in range(im.shape[0]):
   cv2.line(result, (0,row), (int(proj[row]*w/m),row), (255,255,255), 1)

# Save result
cv2.imwrite('result.png', result)
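
As one more possible way to do the visualisation, the bars can be built without a loop by comparing column indices against per-row bar lengths. This is only a sketch: projection_image is a hypothetical helper name, and the four-element projection is made-up test data rather than the 141-entry array above.

```python
import numpy as np

def projection_image(proj, width=500):
    """Hypothetical helper: render a 1-D projection as a bar image."""
    proj = np.asarray(proj, dtype=np.float64)
    peak = proj.max()
    if peak == 0:
        # Nothing to draw: every row is empty
        lengths = np.zeros(len(proj), dtype=int)
    else:
        # Bar length in pixels, scaled so the largest value spans the width
        lengths = (proj * width / peak).astype(int)
    # Pixel (r, c) is white when column c falls inside the bar for row r
    cols = np.arange(width)
    return np.where(cols[None, :] < lengths[:, None], 255, 0).astype(np.uint8)

# Made-up projection with one full row, one short row and two empty rows
bars = projection_image([0, 1020, 255, 0], width=8)
print(bars)
```

The returned array is a single-channel uint8 image, so it could be passed to cv2.imwrite in the same way as result. Bars here are exactly lengths pixels wide, so empty rows stay fully black. cv2.line appears to include both endpoints, so its bars may come out one pixel wider.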
