How to get the bounding box of text that overlaps with background lines?
Problem description
For example, in the following app screenshot, I want to get a bounding box drawn tightly around CA-85S (the text on the horizontal blue line) and Almaden Expy (the text that overlaps the blue line). I am extracting these bounding boxes for OCR.
I have tried several approaches in OpenCV, but none of them worked for me.
Recommended answer
Based on the observation that the desired text is black and contrasts with the blue background lines, one potential approach is color thresholding with cv2.inRange. Here's the main idea and an implementation in Python:
- Obtain a color-thresholded mask. Load the image, convert it to HSV, define lower and upper color ranges, then threshold with cv2.inRange to obtain a mask.
- Merge text into a single contour. Create a rectangular structuring element with cv2.getStructuringElement, then use morphological operations to merge the individual letters into a single contour.
- Filter for text contours. Find contours with cv2.findContours, iterate through them, and filter using cv2.contourArea and aspect ratio. If a contour passes the filter, find its rotated bounding box.
- Isolate text. As an optional step, extract only the text using cv2.bitwise_and.
Here's a visualization of the process:
[Image: Color-thresholded mask]
[Image: Morph close to connect text into a single contour]
[Image: Result]
[Image: Extracted individual text]
Code
import cv2
import numpy as np
# Load image, convert to HSV, color threshold to get mask
image = cv2.imread('1.png')
original = image.copy()
blank = np.zeros(image.shape[:2], dtype=np.uint8)
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
lower = np.array([0, 0, 0])
upper = np.array([179, 255, 165])
mask = cv2.inRange(hsv, lower, upper)
# Merge text into a single contour
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
close = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel, iterations=3)
# Find contours
cnts = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    # Filter using contour area and aspect ratio
    x, y, w, h = cv2.boundingRect(c)
    area = cv2.contourArea(c)
    ar = w / float(h)
    # Parenthesized so the area check applies to both aspect-ratio cases
    if ((1.4 < ar < 4) or ar < .85) and area > 100:
        # Find rotated bounding box
        rect = cv2.minAreaRect(c)
        box = cv2.boxPoints(rect)
        box = box.astype(np.int32)  # np.int0 is unavailable in recent NumPy releases
        cv2.drawContours(image, [box], 0, (36, 255, 12), 2)
        cv2.drawContours(blank, [box], 0, (255, 255, 255), -1)
# Bitwise operations to isolate text
extract = cv2.bitwise_and(mask, blank)
extract = cv2.bitwise_and(original, original, mask=extract)
extract[extract==0] = 255
cv2.imshow('mask', mask)
cv2.imshow('image', image)
cv2.imshow('close', close)
cv2.imshow('extract', extract)
cv2.waitKey()
Note: The HSV lower and upper color threshold ranges were determined using this script:
import cv2
import numpy as np
def nothing(x):
    pass
# Load image
image = cv2.imread('1.png')
# Create a window
cv2.namedWindow('image')
# Create trackbars for color change
# Hue is from 0-179 for Opencv
cv2.createTrackbar('HMin', 'image', 0, 179, nothing)
cv2.createTrackbar('SMin', 'image', 0, 255, nothing)
cv2.createTrackbar('VMin', 'image', 0, 255, nothing)
cv2.createTrackbar('HMax', 'image', 0, 179, nothing)
cv2.createTrackbar('SMax', 'image', 0, 255, nothing)
cv2.createTrackbar('VMax', 'image', 0, 255, nothing)
# Set default value for Max HSV trackbars
cv2.setTrackbarPos('HMax', 'image', 179)
cv2.setTrackbarPos('SMax', 'image', 255)
cv2.setTrackbarPos('VMax', 'image', 255)
# Initialize HSV min/max values
hMin = sMin = vMin = hMax = sMax = vMax = 0
phMin = psMin = pvMin = phMax = psMax = pvMax = 0
while True:
    # Get current positions of all trackbars
    hMin = cv2.getTrackbarPos('HMin', 'image')
    sMin = cv2.getTrackbarPos('SMin', 'image')
    vMin = cv2.getTrackbarPos('VMin', 'image')
    hMax = cv2.getTrackbarPos('HMax', 'image')
    sMax = cv2.getTrackbarPos('SMax', 'image')
    vMax = cv2.getTrackbarPos('VMax', 'image')
    # Set minimum and maximum HSV values to display
    lower = np.array([hMin, sMin, vMin])
    upper = np.array([hMax, sMax, vMax])
    # Convert to HSV format and color threshold
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, lower, upper)
    result = cv2.bitwise_and(image, image, mask=mask)
    # Print if there is a change in HSV value
    if (phMin != hMin or psMin != sMin or pvMin != vMin
            or phMax != hMax or psMax != sMax or pvMax != vMax):
        print("(hMin = %d, sMin = %d, vMin = %d), (hMax = %d, sMax = %d, vMax = %d)" % (hMin, sMin, vMin, hMax, sMax, vMax))
        phMin = hMin
        psMin = sMin
        pvMin = vMin
        phMax = hMax
        psMax = sMax
        pvMax = vMax
    # Display result image
    cv2.imshow('image', result)
    # Press 'q' to quit
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break
cv2.destroyAllWindows()