How to process and extract text from image
Problem description
I'm trying to extract text from an image using Python and cv2. The result is pathetic and I can't figure out a way to improve my code. I believe the image needs to be processed before the text is extracted, but I'm not sure how.
I've tried converting it to black and white, but no luck.
import cv2
import os
import pytesseract
from PIL import Image
import time

pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'

cam = cv2.VideoCapture(1, cv2.CAP_DSHOW)
cam.set(cv2.CAP_PROP_FRAME_WIDTH, 8000)
cam.set(cv2.CAP_PROP_FRAME_HEIGHT, 6000)

while True:
    return_value, image = cam.read()
    if not return_value:
        break  # stop when the camera returns no frame, so cam.release() is reached
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    image = image[127:219, 508:722]  # crop to a fixed region of interest
    #(thresh, image) = cv2.threshold(image, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    cv2.imwrite('test.jpg', image)
    print('Text detected: {}'.format(pytesseract.image_to_string(Image.open('test.jpg'))))
    time.sleep(2)

cam.release()
#os.system('del test.jpg')
Recommended answer
Preprocessing to clean the image before performing text extraction can help. Here's a simple approach:
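- Convert the image to grayscale and sharpen it
- Threshold to obtain a binary image (the code uses Otsu's method)
- Perform morphological operations to clean the image
- Invert the image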
First we convert to grayscale, then sharpen the image using a sharpening kernel
Next we threshold to obtain a binary image (the code uses Otsu's method rather than a locally adaptive threshold)
Now we perform morphological transformations to smooth the image
Finally we invert the image
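import cv2
import numpy as np

# Load the image and convert it to grayscale
image = cv2.imread('1.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Sharpen with a 3x3 kernel whose weights sum to 1
sharpen_kernel = np.array([[-1,-1,-1], [-1,9,-1], [-1,-1,-1]])
sharpen = cv2.filter2D(gray, -1, sharpen_kernel)

# Otsu's threshold, inverted so that dark text becomes white
thresh = cv2.threshold(sharpen, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Morphological close to fill small gaps in the strokes
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel, iterations=1)

# Invert back to black text on a white background
result = 255 - close

# Show each intermediate stage
cv2.imshow('sharpen', sharpen)
cv2.imshow('thresh', thresh)
cv2.imshow('close', close)
cv2.imshow('result', result)
cv2.waitKey()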