Increase space between text lines in image


Question

I have an input image of a paragraph of text in single line spacing. I'm trying to implement something like the line spacing option in Microsoft Word to increase or decrease the space between text lines. The current image is single-spaced; how can I convert the text to double spacing? Or, say, 0.5 spacing? Essentially I'm trying to dynamically restructure the spacing between text lines, preferably with an adjustable parameter. Something like this:

Input image

Desired result

My current attempt looks like this. I've been able to increase the spacing slightly, but the text detail seems to be eroded and there is random noise in between lines.

Any ideas on how to improve the code, or any better approaches?

import numpy as np 
import cv2

img = cv2.imread('text.png')
H, W = img.shape[:2]
grey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
threshed = cv2.threshold(grey, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

hist = cv2.reduce(threshed, 1, cv2.REDUCE_AVG).reshape(-1)
spacing = 2
delimeter = [y for y in range(H - 1) if hist[y] <= spacing < hist[y + 1]]
arr = []
y_prev, y_curr = 0, 0
for y in delimeter:
    y_prev = y_curr
    y_curr = y
    arr.append(threshed[y_prev:y_curr, 0:W])

arr.append(threshed[y_curr:H, 0:W])
space_array = np.zeros((10, W))
result = np.zeros((1, W))

for im in arr:
    v = np.concatenate((space_array, im), axis=0)
    result = np.concatenate((result, v), axis=0)

result = (255 - result).astype(np.uint8)
cv2.imshow('result', result)
cv2.waitKey()

Recommended answer

Approach #1: Pixel analysis

  1. Obtain binary image. Load the image, convert to grayscale, and apply Otsu's threshold.

  2. Sum row pixels. The idea is that the pixel sum of a row can be used to determine whether it corresponds to text or white space.

  3. Create new image and add additional white space. We iterate through the pixel array and add additional white space.


Binary image

# Load image, grayscale, Otsu's threshold
image = cv2.imread('1.png')
h, w = image.shape[:2]
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

Now we iterate through each row and sum the white pixels to generate a pixel array. We can profile a column of data generated from the sum of all the pixels in each row to determine which rows correspond to text. Sections of the data that equal 0 represent rows of the image that are composed of white space. Here's a visualization of the data array:

# Sum white pixels in each row
# Create blank space array and final image
pixels = np.sum(thresh, axis=1).tolist()
space = np.ones((2, w), dtype=np.uint8) * 255
# Start from zero rows so no stray black row ends up at the top
result = np.zeros((0, w), dtype=np.uint8)
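To make the row profile more concrete, here is a small sketch (not part of the original answer) that turns the per-row sums into explicit text bands. The helper name `find_text_bands` and the toy array are assumptions for illustration only; with a real image you would pass the `pixels` values computed from `thresh`.

```python
import numpy as np

def find_text_bands(row_sums):
    """Return (start, stop) row ranges whose sum is non-zero; stop is exclusive."""
    is_text = np.asarray(row_sums) > 0
    # Pad with False on both ends so every band has a rising and a falling edge.
    padded = np.concatenate(([False], is_text, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    # Edges alternate: start of a band, end of that band, next start, ...
    return list(zip(edges[0::2].tolist(), edges[1::2].tolist()))

# Toy 8x4 "thresholded" image: two text lines separated by blank rows.
toy_thresh = np.zeros((8, 4), dtype=np.uint8)
toy_thresh[1:3, 1:3] = 255   # first line occupies rows 1-2
toy_thresh[5:7, 0:2] = 255   # second line occupies rows 5-6

toy_pixels = np.sum(toy_thresh, axis=1)
bands = find_text_bands(toy_pixels)
print(bands)  # [(1, 3), (5, 7)]
```

Everything outside the returned bands is a row whose sum is 0, which is exactly the set of rows the next step pads with extra white space.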

We convert the data to a list and iterate through it to build the final image. If a row is determined to be white space, we concatenate an empty space array to the final image. By adjusting the size of the empty array, we can change the amount of space added to the image.

# Iterate through each row and add space if entire row is empty
# otherwise add original section of image to final image
for index, value in enumerate(pixels):
    if value == 0:
        result = np.concatenate((result, space), axis=0)
    row = gray[index:index+1, 0:w]
    result = np.concatenate((result, row), axis=0)

Here's the result

Code

import cv2
import numpy as np 
import matplotlib.pyplot as plt
# import pandas as pd

# Load image, grayscale, Otsu's threshold
image = cv2.imread('1.png')
h, w = image.shape[:2]
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Sum white pixels in each row
# Create blank space array and final image
pixels = np.sum(thresh, axis=1).tolist()
space = np.ones((1, w), dtype=np.uint8) * 255
result = np.zeros((0, w), dtype=np.uint8)

# Iterate through each row and add space if entire row is empty
# otherwise add original section of image to final image
for index, value in enumerate(pixels):
    if value == 0:
        result = np.concatenate((result, space), axis=0)
    row = gray[index:index+1, 0:w]
    result = np.concatenate((result, row), axis=0)

# Uncomment for plot visualization
'''
x = range(len(pixels))[::-1]
df = pd.DataFrame({'y': x, 'x': pixels})
df.plot(x='x', y='y', xlim=(-2000,max(pixels) + 2000), legend=None, color='teal')
'''
cv2.imshow('result', result)
cv2.imshow('thresh', thresh)
plt.show()
cv2.waitKey()
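The loop above calls np.concatenate once per row, which copies the growing result each time. As a hedged alternative, the sketch below does the same insertion in one pass with np.repeat and exposes the amount of added space as a parameter. The function name `expand_blank_rows` and its `extra` and `fill` parameters are assumptions for illustration, not part of the original answer; the toy arrays stand in for `gray` and `thresh`.

```python
import numpy as np

def expand_blank_rows(gray, thresh, extra=1, fill=255):
    """Insert `extra` rows of `fill` before every row whose thresholded sum is 0."""
    blank = thresh.sum(axis=1) == 0
    # A blank row contributes extra + 1 output rows, a text row contributes 1.
    counts = np.where(blank, extra + 1, 1)
    source_rows = np.repeat(np.arange(gray.shape[0]), counts)
    result = gray[source_rows].copy()
    # The last copy of each source row is the original; earlier copies are filler.
    is_original = np.zeros(len(source_rows), dtype=bool)
    is_original[np.cumsum(counts) - 1] = True
    result[~is_original] = fill
    return result

# Toy data: rows 0 and 2 are blank in the thresholded image.
toy_gray = np.array([[250] * 3, [10] * 3, [251] * 3, [20] * 3], dtype=np.uint8)
toy_thresh = np.array([[0] * 3, [255] * 3, [0] * 3, [255] * 3], dtype=np.uint8)

out = expand_blank_rows(toy_gray, toy_thresh, extra=2)
print(out[:, 0].tolist())  # [255, 255, 250, 10, 255, 255, 251, 20]
```

Because space is added for every blank row, existing gaps (and the top and bottom margins) grow in proportion to their original height; `extra=0` returns the image unchanged.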

Approach #2: Individual line extraction

For a more dynamic approach, we can find the contour of each line and then add space between the contours. We append the extra white space the same way as in the first approach.

  1. Obtain binary image. Load the image, convert to grayscale, apply a Gaussian blur, and apply Otsu's threshold.

  2. Connect text contours. We create a horizontal kernel and dilate to connect the words of each line into a single contour.

  3. Extract each line contour. We find contours, sort them from top to bottom using imutils.contours.sort_contours(), and extract each line ROI.

  4. Append white space in between each line. We create an empty array and build the new image by appending white space between each line contour.


Binary image

# Load image, grayscale, blur, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
invert = 255 - thresh  
height, width = image.shape[:2]

Create a horizontal kernel and dilate

# Dilate with a horizontal kernel to connect text contours
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (10,2))
dilate = cv2.dilate(thresh, kernel, iterations=2)

Extracted individual line contours highlighted in green

# Extract each line contour
lines = []
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
(cnts, _) = contours.sort_contours(cnts, method="top-to-bottom")
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(image, (0, y), (width, y+h), (36,255,12), 2)
    line = original[y:y+h, 0:width]
    line = cv2.cvtColor(line, cv2.COLOR_BGR2GRAY)
    lines.append(line)
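The loop above treats every contour as one text line. If the dilation leaves a line split into several contours (for example a detached punctuation mark), the same rows would be cropped more than once. The sketch below is one possible guard, not something from the original answer: it sorts bounding boxes by their y coordinate in plain Python and merges boxes whose row ranges overlap. The helper name `merge_line_boxes` and the sample boxes are hypothetical; in practice the tuples would come from cv2.boundingRect(c).

```python
def merge_line_boxes(boxes):
    """Sort (x, y, w, h) boxes top-to-bottom and merge vertically overlapping ones.

    Returns a list of (top, bottom) row ranges with bottom exclusive.
    """
    bands = []
    for x, y, w, h in sorted(boxes, key=lambda b: b[1]):
        if bands and y < bands[-1][1]:
            # Overlaps the previous band vertically, so extend that band.
            bands[-1] = (bands[-1][0], max(bands[-1][1], y + h))
        else:
            bands.append((y, y + h))
    return bands

# Hypothetical boxes: two fragments of the first line plus a second line.
sample_boxes = [(0, 40, 200, 12), (5, 10, 180, 12), (150, 8, 20, 6)]
merged = merge_line_boxes(sample_boxes)
print(merged)  # [(8, 22), (40, 52)]
```

Each resulting (top, bottom) pair can then be cropped as original[top:bottom, 0:width] in place of the per-contour crop.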

Append white space in between each line. Here's the result with a 1-pixel-high space array

Result with a 5-pixel-high space array

# Append white space in between each line
space = np.ones((1, width), dtype=np.uint8) * 255
result = np.zeros((0, width), dtype=np.uint8)
result = np.concatenate((result, space), axis=0)
for line in lines:
    result = np.concatenate((result, line), axis=0)
    result = np.concatenate((result, space), axis=0)

Full code

import cv2
import numpy as np 
from imutils import contours

# Load image, grayscale, blur, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
invert = 255 - thresh  
height, width = image.shape[:2]

# Dilate with a horizontal kernel to connect text contours
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (10,2))
dilate = cv2.dilate(thresh, kernel, iterations=2)

# Extract each line contour
lines = []
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
(cnts, _) = contours.sort_contours(cnts, method="top-to-bottom")
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(image, (0, y), (width, y+h), (36,255,12), 2)
    line = original[y:y+h, 0:width]
    line = cv2.cvtColor(line, cv2.COLOR_BGR2GRAY)
    lines.append(line)

# Append white space in between each line
space = np.ones((1, width), dtype=np.uint8) * 255
result = np.zeros((0, width), dtype=np.uint8)
result = np.concatenate((result, space), axis=0)
for line in lines:
    result = np.concatenate((result, line), axis=0)
    result = np.concatenate((result, space), axis=0)

cv2.imshow('result', result)
cv2.imshow('image', image)
cv2.imshow('dilate', dilate)
cv2.waitKey()
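Since the question asks for an adjustable parameter, the final stacking step can be wrapped in a small function. This is a minimal sketch under the assumption that every line crop is a 2-D grayscale array of the same width, as produced above; the name `stack_lines` and its `gap` and `fill` parameters are illustrative and not part of the original answer.

```python
import numpy as np

def stack_lines(lines, gap=5, fill=255):
    """Stack 2-D line images vertically with `gap` filler rows around each one."""
    width = lines[0].shape[1]
    spacer = np.full((gap, width), fill, dtype=lines[0].dtype)
    # Same layout as the loop above: space, line, space, line, ..., space.
    pieces = [spacer]
    for line in lines:
        pieces.extend([line, spacer])
    # A single concatenate avoids re-copying the growing result per line.
    return np.concatenate(pieces, axis=0)

# Toy line crops: one 3 rows high, one 2 rows high, both 4 pixels wide.
toy_lines = [np.zeros((3, 4), dtype=np.uint8), np.zeros((2, 4), dtype=np.uint8)]
stacked = stack_lines(toy_lines, gap=2)
print(stacked.shape)           # (11, 4)
print(stacked[:, 0].tolist())  # [255, 255, 0, 0, 0, 255, 255, 0, 0, 255, 255]
```

Calling it with `gap=1` or `gap=5` should correspond to the two results described above, and larger values spread the lines further apart.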
