How to get the bounding box of text that overlaps background lines?


Problem description

For example, in the following app screenshot, I want to get bounding boxes that tightly enclose CA-85S (the text on the horizontal blue line) and Almaden Expy (text that overlaps the blue line). I am extracting these bounding boxes for OCR.

I've tried several approaches in OpenCV, but none of them worked for me.

Recommended answer

The desired text is black and contrasts with the blue background lines, so one potential approach is color thresholding with cv2.inRange. Here's the main idea and an implementation in Python:

  1. Obtain a color-thresholded mask. Load the image, convert it to HSV format, define lower and upper color ranges, then apply the color threshold to obtain a mask.

  2. Merge text into a single contour. Create a rectangular structuring element with cv2.getStructuringElement, then use morphological operations to merge the individual letters into a single contour.

  3. Filter for text contours. Find contours with cv2.findContours, iterate through them, and filter using cv2.contourArea and aspect ratio. If a contour passes this filter, find its rotated bounding box.

  4. Isolate text. This optional step extracts only the text using cv2.bitwise_and.


Here's a visualization of the process:

Color-thresholded mask

Morph close to connect text into a single contour

Result

Extracted isolated text

Code

import cv2
import numpy as np

# Load image, convert to HSV, color threshold to get mask
image = cv2.imread('1.png')
original = image.copy()
blank = np.zeros(image.shape[:2], dtype=np.uint8)
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
lower = np.array([0, 0, 0])
upper = np.array([179, 255, 165])
mask = cv2.inRange(hsv, lower, upper)

# Merge text into a single contour
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
close = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel, iterations=3)

# Find contours
cnts = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    # Filter using contour area and aspect ratio
    x,y,w,h = cv2.boundingRect(c)
    area = cv2.contourArea(c)
    ar = w / float(h)
    # Group the aspect-ratio tests so the area check applies to both
    if ((1.4 < ar < 4) or ar < .85) and area > 100:
        # Find rotated bounding box
        rect = cv2.minAreaRect(c)
        box = cv2.boxPoints(rect)
        # np.int0 was removed in NumPy 2.0; cast to an integer type explicitly
        box = box.astype(np.int32)
        cv2.drawContours(image,[box],0,(36,255,12),2)
        cv2.drawContours(blank,[box],0,(255,255,255),-1)

# Bitwise operations to isolate text
text_mask = cv2.bitwise_and(mask, blank)
extract = cv2.bitwise_and(original, original, mask=text_mask)
# Whiten only pixels outside the text mask, so pure-black text stays black
extract[text_mask == 0] = 255

cv2.imshow('mask', mask)
cv2.imshow('image', image)
cv2.imshow('close', close)
cv2.imshow('extract', extract)
cv2.waitKey()
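
When an aspect-ratio test is combined with an area test, note that Python's `and` binds more tightly than `or`, so the grouping decides which contours the area check applies to. The sketch below compares the two readings; the function names and the sample values are assumptions for illustration only.

```python
def passes_ungrouped(w, h, area):
    """Filter without explicit grouping: area is checked only when ar < 0.85."""
    ar = w / float(h)
    return (ar > 1.4 and ar < 4) or ar < .85 and area > 100

def passes_grouped(w, h, area):
    """Filter with explicit grouping: area is checked for every contour."""
    ar = w / float(h)
    return ((1.4 < ar < 4) or ar < .85) and area > 100

# A tiny 6x3 speck (aspect ratio 2.0, area 10) only slips through ungrouped
speck = (6, 3, 10)
print(passes_ungrouped(*speck), passes_grouped(*speck))

# A word-sized region (aspect ratio 3.0, area 2400) passes both
word = (120, 40, 2400)
print(passes_ungrouped(*word), passes_grouped(*word))
```

Which behavior is wanted depends on the image; the grouped form is the stricter one, since it rejects small noise regardless of shape.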

Note: The HSV lower and upper color threshold ranges were determined using this script:

import cv2
import numpy as np

def nothing(x):
    pass

# Load image
image = cv2.imread('1.png')

# Create a window
cv2.namedWindow('image')

# Create trackbars for color change
# Hue is from 0-179 for Opencv
cv2.createTrackbar('HMin', 'image', 0, 179, nothing)
cv2.createTrackbar('SMin', 'image', 0, 255, nothing)
cv2.createTrackbar('VMin', 'image', 0, 255, nothing)
cv2.createTrackbar('HMax', 'image', 0, 179, nothing)
cv2.createTrackbar('SMax', 'image', 0, 255, nothing)
cv2.createTrackbar('VMax', 'image', 0, 255, nothing)

# Set default value for Max HSV trackbars
cv2.setTrackbarPos('HMax', 'image', 179)
cv2.setTrackbarPos('SMax', 'image', 255)
cv2.setTrackbarPos('VMax', 'image', 255)

# Initialize HSV min/max values
hMin = sMin = vMin = hMax = sMax = vMax = 0
phMin = psMin = pvMin = phMax = psMax = pvMax = 0

while(1):
    # Get current positions of all trackbars
    hMin = cv2.getTrackbarPos('HMin', 'image')
    sMin = cv2.getTrackbarPos('SMin', 'image')
    vMin = cv2.getTrackbarPos('VMin', 'image')
    hMax = cv2.getTrackbarPos('HMax', 'image')
    sMax = cv2.getTrackbarPos('SMax', 'image')
    vMax = cv2.getTrackbarPos('VMax', 'image')

    # Set minimum and maximum HSV values to display
    lower = np.array([hMin, sMin, vMin])
    upper = np.array([hMax, sMax, vMax])

    # Convert to HSV format and color threshold
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, lower, upper)
    result = cv2.bitwise_and(image, image, mask=mask)

    # Print if there is a change in HSV value
    if (phMin, psMin, pvMin, phMax, psMax, pvMax) != (hMin, sMin, vMin, hMax, sMax, vMax):
        print("(hMin = %d , sMin = %d, vMin = %d), (hMax = %d , sMax = %d, vMax = %d)" % (hMin , sMin , vMin, hMax, sMax , vMax))
        phMin = hMin
        psMin = sMin
        pvMin = vMin
        phMax = hMax
        psMax = sMax
        pvMax = vMax

    # Display result image
    cv2.imshow('image', result)
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break

cv2.destroyAllWindows()
