使用opencv获取图像中存在的所有文本的位置 [英] Get the location of all text present in image using opencv
问题描述
我有这张图像,其中包含文本(数字和字母).我想获取此图像中存在的所有文本和数字的位置.我也想提取所有文本.
I have this image that contains text(numbers and alphabets) in it. I want to get the location of all the text and numbers present in this image. Also I want to extract all the text as well.
如何获取坐标以及图像中的所有文本(数字和字母).例如10B,44、16、38、22B等
How do I get the cordinates as well as the all the text(numbers and alphabets) in my image. For eg 10B, 44, 16, 38, 22B etc
推荐答案
Here's a potential approach using morphological operations to filter out non-text contours. The idea is:
-
获取二进制图像.,先加载灰度图像,然后 Otsu的阈值
Obtain binary image. Load image, grayscale, then Otsu's threshold
删除水平和垂直线.使用 cv2.drawContours
删除对角线,圆形对象和弯曲轮廓.使用轮廓区域 cv2.approxPolyDP
隔离非文本轮廓
Remove diagonal lines, circle objects, and curved contours. Filter using contour area cv2.contourArea
and contour approximation cv2.approxPolyDP
to isolate non-text contours
提取文本ROI和OCR.查找轮廓并过滤ROI,然后使用OCR Pytesseract .
Extract text ROIs and OCR. Find contours and filter for ROIs then OCR using Pytesseract.
删除了以绿色突出显示的水平线
Removed horizontal lines highlighted in green
删除垂直线
删除了各种非文本轮廓(对角线,圆形对象和曲线)
Removed assorted non-text contours (diagonal lines, circular objects, and curves)
检测到的文本区域
import cv2
import numpy as np
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Load image, grayscale, Otsu's threshold
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
clean = thresh.copy()
# Remove horizontal lines
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15,1))
detect_horizontal = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts = cv2.findContours(detect_horizontal, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
cv2.drawContours(clean, [c], -1, 0, 3)
# Remove vertical lines
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,30))
detect_vertical = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, vertical_kernel, iterations=2)
cnts = cv2.findContours(detect_vertical, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
cv2.drawContours(clean, [c], -1, 0, 3)
cnts = cv2.findContours(clean, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
# Remove diagonal lines
area = cv2.contourArea(c)
if area < 100:
cv2.drawContours(clean, [c], -1, 0, 3)
# Remove circle objects
elif area > 1000:
cv2.drawContours(clean, [c], -1, 0, -1)
# Remove curve stuff
peri = cv2.arcLength(c, True)
approx = cv2.approxPolyDP(c, 0.02 * peri, True)
x,y,w,h = cv2.boundingRect(c)
if len(approx) == 4:
cv2.rectangle(clean, (x, y), (x + w, y + h), 0, -1)
open_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,2))
opening = cv2.morphologyEx(clean, cv2.MORPH_OPEN, open_kernel, iterations=2)
close_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,2))
close = cv2.morphologyEx(opening, cv2.MORPH_CLOSE, close_kernel, iterations=4)
cnts = cv2.findContours(close, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
x,y,w,h = cv2.boundingRect(c)
area = cv2.contourArea(c)
if area > 500:
ROI = image[y:y+h, x:x+w]
ROI = cv2.GaussianBlur(ROI, (3,3), 0)
data = pytesseract.image_to_string(ROI, lang='eng',config='--psm 6')
if data.isalnum():
cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
print(data)
cv2.imwrite('image.png', image)
cv2.imwrite('clean.png', clean)
cv2.imwrite('close.png', close)
cv2.imwrite('opening.png', opening)
cv2.waitKey()
这篇关于使用opencv获取图像中存在的所有文本的位置的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!