How to get the coordinates of text recognized from an image using OCR in Python
Problem description
I am trying to get the coordinates (positions) of text characters in an image using Tesseract. I want to know the exact pixel position so that I can click that text using another tool.
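import pytesseract
from PIL import Image

img = 'E:\\OCR-DATA\\sample.jpg'
imge = Image.open(img)
# boxes=True asks for one line per character together with its bounding box.
# (This keyword belongs to older pytesseract releases; newer ones provide
# pytesseract.image_to_boxes for the same purpose.)
data = pytesseract.image_to_string(imge, lang='eng', boxes=True, config='hocr')
print(data)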
The data variable contains the recognized text along with box boundary values, but I am not sure how to use those boundary values to get the coordinates of the text. The value of data is as follows:
O 100 356 115 373 0
u 117 356 127 368 0
t 130 356 138 372 0
p 141 351 152 368 0
u 154 356 164 368 0
t 167 356 175 371 0
Recommended answer
Every line already contains the coordinates of a bounding box, in this order:
character, left, bottom, right, top, page
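So for each character you get the character itself, followed by its bounding box coordinates, followed by the 0-based page number. Tesseract's box format measures these coordinates in pixels from the bottom-left corner of the image, so a tool that counts y from the top needs the vertical values flipped against the image height.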
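As a rough illustration of that layout, the sketch below parses the box lines shown in the question and turns them into a point that could be clicked. It is only a sketch under stated assumptions: the names CharBox, parse_boxes, union_box and click_point are hypothetical helpers (not part of pytesseract), the image height of 400 pixels is made up for the example, and the clicking tool is assumed to use a top-left origin.

```python
from typing import List, NamedTuple, Tuple


class CharBox(NamedTuple):
    """One line of Tesseract box output (hypothetical helper type)."""
    char: str
    left: int
    bottom: int
    right: int
    top: int
    page: int


def parse_boxes(data: str) -> List[CharBox]:
    """Parse lines of the form 'char left bottom right top page'."""
    boxes = []
    for line in data.splitlines():
        # Split from the right so the five numeric fields are always isolated.
        parts = line.rsplit(None, 5)
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        char, *numbers = parts
        boxes.append(CharBox(char, *(int(n) for n in numbers)))
    return boxes


def union_box(boxes: List[CharBox]) -> CharBox:
    """Smallest box enclosing all given character boxes (e.g. a whole word)."""
    return CharBox(
        "".join(b.char for b in boxes),
        min(b.left for b in boxes),
        min(b.bottom for b in boxes),
        max(b.right for b in boxes),
        max(b.top for b in boxes),
        boxes[0].page,
    )


def click_point(box: CharBox, image_height: int) -> Tuple[int, int]:
    """Approximate box center, converted to top-left-origin pixel coordinates."""
    x = (box.left + box.right) // 2
    y_from_bottom = (box.bottom + box.top) // 2
    # Box coordinates count upward from the bottom edge, so flip the y axis.
    return x, image_height - y_from_bottom


SAMPLE = """O 100 356 115 373 0
u 117 356 127 368 0
t 130 356 138 372 0
p 141 351 152 368 0
u 154 356 164 368 0
t 167 356 175 371 0"""

IMAGE_HEIGHT = 400  # assumed for illustration; use imge.height for a real image

chars = parse_boxes(SAMPLE)
word = union_box(chars)
print(word.char, click_point(word, IMAGE_HEIGHT))
```

For a real image, the height would come from the opened PIL image (imge.height in the question's code) rather than a constant, and the resulting point could then be handed to whatever tool performs the click.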