Is it possible to change a part of the background color of an image, when the image is a table?

Views: 108 Published: 2020/5/19 19:33:43 Tags: python opencv ocr opencv3.1 python-tesseract

Problem description

I am using pytesseract, Pillow, and cv2 to OCR an image and extract the text in it. Since my input is a scanned PDF document, I first convert it to an image (JPEG) and then try to extract the text. I am only halfway there: the input is a table, and the titles are not extracted because they have a black background. I also tried getStructuringElement but could not figure out how to make it work. Here is what I did:

import cv2
import os  
import numpy as np 
import pytesseract
#import pillow 

# The scanned PDF can't be OCR'd directly, so first convert each page to JPEG with pdf2image
filename = "path"  # placeholder: path to the scanned PDF
from pdf2image import convert_from_path
pages = convert_from_path(filename, 500)  # render at 500 DPI
for page in pages:
    page.save("dest", 'JPEG')  # placeholder name; each page overwrites the previous one


imgname = "path" 
oriimg = cv2.imread(imgname,cv2.IMREAD_COLOR) 
cv2.imshow("original image", oriimg)
cv2.waitKey(0)


#img = cv2.resize(oriimg,None,fx=0.5,fy=0.5,interpolation=cv2.INTER_CUBIC) 
img = cv2.resize(oriimg,(700,1500),interpolation=cv2.INTER_AREA) 
# dsize is (width, height): 700 pixels wide, 1500 pixels tall
cv2.imshow("lol", img) 
cv2.waitKey(0) 
cv2.imwrite("changed_dimensionsimgpath", img)


import PIL.Image  
image = cv2.imread(imgname,cv2.IMREAD_COLOR) 
grayedimg = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Otsu's method picks the threshold automatically; [1] selects the binarized image
grayedimg = cv2.threshold(grayedimg, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
cv2.imwrite("H://newim.jpg", grayedimg)


pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"


text = pytesseract.image_to_string(PIL.Image.open("path"))
print(text)

My input table looks like the one below. The regions with a black background are not recognized by the OCR and are not extracted as text.
