Delete OCR word from Image (OpenCV, Python)


Problem description

So, here is where I am starting from.

I am working with OCR. The script works pretty well for what I need: it detects the words with an accuracy that is fine for me.

This is the result: 100% accuracy on the attached image.

from PIL import Image
import pyocr.builders
import os

os.putenv("TESSDATA_PREFIX", "C:\\Program Files (x86)\\Tesseract-OCR")

tools = pyocr.get_available_tools()
tool = tools[0]
langs = tool.get_available_languages()
lang = langs[0] #eng

file = "test.png"

txt = tool.image_to_string(Image.open(file), lang=lang, builder=pyocr.builders.TextBuilder())
print(txt + '\n')

'''
word = ['SHINE','ON','YOU','CRAZY','DIAMOND','SYD']

if word[2] in txt:
    print("## WORD IN LIST ##")
else:
    print("## NOT IN LIST ##")'''

Now the question: how can I remove from the image a word that exists in the OCR output (named txt in the code)? I mean, if the word SHINE appears in the console output (and in the list), how can I delete it from the image? Or, if it cannot be removed, how can I create a mask so that I can hide it?

I think OCR works by selecting areas of text and creating a bounding box around each one. In that case, how do I delete (or even just show) this ROI/bounding box? The pyocr documentation has some hints about showing bounding boxes, but I don't know how to use that feature.

Any help or hint is appreciated.

Thanks

This code shows me the bounding box for each character:

import csv
import cv2
from pytesseract import pytesseract as pt

pt.run_tesseract('test.png', 'output', lang=None, boxes=True, config="hocr")

# To read the coordinates
boxes = []
with open('output.box', 'rt') as f:
    reader = csv.reader(f, delimiter = ' ')
    for row in reader:
        if len(row) == 6:
            boxes.append(row)

# Draw the bounding box
img = cv2.imread('test.png')
h, w, _ = img.shape
for b in boxes:
    img = cv2.rectangle(img,(int(b[1]),h-int(b[2])),(int(b[3]),h-int(b[4])),(255,0,0),2)

cv2.imshow('output', img)
cv2.waitKey(0)

How can I tell it to show me only the first (whole) word?

Recommended answer

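Here's a simple approach:

  • Convert the image to grayscale
  • Apply Otsu's threshold
  • Dilate to connect contours
  • Find contours and extract the ROI for each word
  • Perform OCR and remove the word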

After converting to grayscale, we apply Otsu's threshold to obtain a binary image
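To make the thresholding step concrete, here is a minimal pure-Python sketch of the idea behind Otsu's method: pick the gray level that maximizes the between-class variance of the two resulting pixel groups. The function name `otsu_threshold` and the flat-list input are assumptions made for illustration; in the answer's code the same job is done by `cv2.threshold` with `cv2.THRESH_OTSU`, and this sketch is not meant to reproduce OpenCV's exact return value.

```python
def otsu_threshold(pixels):
    """Return a gray level t that maximizes between-class variance.

    `pixels` is a flat iterable of integer gray values in 0..255.
    Pixels with value <= t form one class, all others form the second class.
    """
    # Build the 256-bin histogram of gray levels.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_low = 0  # number of pixels with value <= t
    sum_low = 0     # sum of gray values of those pixels
    for t in range(256):
        weight_low += hist[t]
        sum_low += t * hist[t]
        weight_high = total - weight_low
        if weight_low == 0:
            continue  # no pixels in the lower class yet
        if weight_high == 0:
            break     # every pixel is already in the lower class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_var:
            best_t, best_var = t, variance
    return best_t


def binarize(pixels, t):
    """Map pixels above the threshold to 255 (white) and the rest to 0 (black)."""
    return [255 if p > t else 0 for p in pixels]
```

For an image with dark text on a light background, the chosen level falls between the two clusters of gray values, so `binarize` leaves the text black and the background white.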

Next we invert the image and dilate it so that each word forms a single contour
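The reason dilation joins the letters of a word can be seen in a small sketch. It assumes a binary grid of 0/1 values with text pixels set to 1 (that is, after inversion) and a rectangular kernel with odd width and height; `dilate` and `count_runs` are hypothetical helper names, not OpenCV functions.

```python
def dilate(grid, kernel_w, kernel_h):
    """Binary dilation with a rectangular kernel of odd width and height.

    An output pixel becomes 1 if any input pixel under the kernel,
    centered on that position, is 1.
    """
    rows, cols = len(grid), len(grid[0])
    half_w, half_h = kernel_w // 2, kernel_h // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr in range(-half_h, half_h + 1):
                for dc in range(-half_w, half_w + 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and grid[rr][cc]:
                        out[r][c] = 1
    return out


def count_runs(row):
    """Count separate horizontal runs of 1s in a single row."""
    runs, previous = 0, 0
    for value in row:
        if value and not previous:
            runs += 1
        previous = value
    return runs
```

Small gaps (between letters) close up while larger gaps (between words) survive, which is why the kernel size and the number of iterations have to be tuned to the font size and spacing of the image at hand.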

From here we find contours and extract the ROI for each word. Here are the detected ROIs

We pass each ROI to Pytesseract OCR. If the OCR result is a word we want to remove, we simply "delete" the word by filling the ROI with white in the original image
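One detail worth guarding against when comparing the OCR result to the word list: OCR output often carries trailing whitespace (a newline or a form feed) or stray punctuation, which makes an exact membership test fail. A minimal sketch of a normalization step follows; `normalize_ocr_text` and `should_remove` are hypothetical helper names, and stripping punctuation is an assumption that may not suit every image.

```python
import string


def normalize_ocr_text(raw):
    """Lowercase OCR output and drop surrounding whitespace and punctuation."""
    # str.strip() with no arguments removes whitespace, including "\n" and "\x0c".
    return raw.strip().strip(string.punctuation).lower()


def should_remove(raw, words_to_remove):
    """Return True if the normalized OCR text is one of the words to remove."""
    targets = {word.lower() for word in words_to_remove}
    return normalize_ocr_text(raw) in targets
```

In the loop of the full code, the raw string returned for each ROI would be passed to `should_remove` in place of the direct `in words_to_remove` test.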

words_to_remove = ['on', 'you', 'crazy']

The result is

Similarly, with

words_to_remove = ['on', 'you', 'shine', 'diamond']

the result is

Finally, with

words_to_remove = ['on', 'you', 'crazy', 'diamond']

import cv2
import pytesseract

words_to_remove = ['on', 'you', 'crazy', 'diamond']
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

image = cv2.imread("1.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
inverted_thresh = 255 - thresh
dilate = cv2.dilate(inverted_thresh, kernel, iterations=4)

cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
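for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    ROI = thresh[y:y+h, x:x+w]
    # Strip the trailing whitespace Tesseract may append so the exact match below can succeed
    data = pytesseract.image_to_string(ROI, lang='eng',config='--psm 6').strip().lower()
    if data in words_to_remove:
        # "Delete" the word by filling its bounding box with white
        image[y:y+h, x:x+w] = [255,255,255]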

cv2.imshow("thresh", thresh)
cv2.imshow("dilate", dilate)
cv2.imshow("image", image)
cv2.waitKey(0)
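As a possible alternative to the contour approach, the per-character rows of the box file from the question could be merged into word boxes. The sketch below assumes the usual Tesseract box-file row layout (`symbol left bottom right top page`, with the origin at the bottom-left corner of the image), rows in reading order, and a hypothetical tuning parameter `max_gap` for the largest horizontal gap still treated as "same word". The function names are illustrative, not part of pytesseract.

```python
def parse_box_line(line):
    """Parse one box-file row: 'symbol left bottom right top page'."""
    parts = line.split(' ')
    if len(parts) != 6:
        return None  # skip malformed rows, as the question's code does
    left, bottom, right, top = (int(v) for v in parts[1:5])
    return parts[0], left, bottom, right, top


def merge_into_words(char_boxes, max_gap=10):
    """Merge consecutive character boxes on the same line into word boxes."""
    words = []
    for symbol, left, bottom, right, top in char_boxes:
        if words:
            text, w_left, w_bottom, w_right, w_top = words[-1]
            same_line = bottom < w_top and top > w_bottom  # vertical overlap
            if same_line and left >= w_left and left - w_right <= max_gap:
                # Extend the current word to cover this character.
                words[-1] = (text + symbol, w_left, min(w_bottom, bottom),
                             max(w_right, right), max(w_top, top))
                continue
        words.append((symbol, left, bottom, right, top))
    return words


def to_top_left(box, image_height):
    """Convert a bottom-left-origin box to an (x, y, w, h) rectangle
    with a top-left origin, the convention OpenCV uses."""
    _, left, bottom, right, top = box
    return left, image_height - top, right - left, top - bottom
```

The first element of the list returned by `merge_into_words` is then the first whole word, and the rectangle from `to_top_left` can be drawn with `cv2.rectangle` or filled with white in the same way as the ROI in the code above.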
