Getting the bounding box of the recognized words using python-tesseract
Question
I am using python-tesseract, a Python wrapper for the Tesseract OCR engine, to extract words from an image.
I am using the following code to get the words:
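import tesseract

# Set up the OCR engine for English, using "." as the data path.
api = tesseract.TessBaseAPI()
api.Init(".", "eng", tesseract.OEM_DEFAULT)
# Restrict recognition to digits and lowercase letters.
api.SetVariable("tessedit_char_whitelist", "0123456789abcdefghijklmnopqrstuvwxyz")
api.SetPageSegMode(tesseract.PSM_AUTO)

# Read the raw image bytes and run OCR on them.
mImgFile = "test.jpg"
with open(mImgFile, "rb") as f:
    mBuffer = f.read()
result = tesseract.ProcessPagesBuffer(mBuffer, len(mBuffer), api)
print("result(ProcessPagesBuffer)= %s" % result)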
This returns only the words, not their location, size, or orientation in the image (in other words, a bounding box containing them). Is there any way to get that as well?
Recommended answer
The tesseract.GetBoxText() method returns the exact position of each character in an array.
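As a rough illustration of working with per-character positions: Tesseract's box-file text lists one character per line as "symbol left bottom right top page", with the origin at the bottom-left corner of the image. The sketch below assumes the box text is available as such a string (whether the wrapper hands it back in exactly this form is an assumption), and the helper name parse_box_text and the sample data are made up for illustration.

```python
def parse_box_text(box_text, image_height):
    """Parse box-file text into (symbol, left, top, right, bottom) tuples.

    Box files use a bottom-left origin; this converts the y coordinates
    to the top-left origin that most imaging libraries use.
    """
    boxes = []
    for line in box_text.splitlines():
        # Split from the right so the symbol itself is kept intact.
        parts = line.strip().rsplit(" ", 5)
        if len(parts) != 6:
            continue  # skip blank or malformed lines
        symbol = parts[0]
        left, bottom, right, top, _page = (int(p) for p in parts[1:])
        boxes.append(
            (symbol, left, image_height - top, right, image_height - bottom)
        )
    return boxes


# Hypothetical box text for a 100-pixel-tall image.
sample_box_text = "h 10 20 30 60 0\ni 35 20 40 62 0\n"
sample_boxes = parse_box_text(sample_box_text, image_height=100)
```

Character boxes that lie on the same line and are separated by small gaps could then be merged into word-level boxes, though the grouping rule would need tuning for the image at hand.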
Besides, there is a command-line option, tesseract test.jpg result hocr, that will generate a result.html file containing the coordinates of each recognized word. I'm not sure whether it can be called from a Python script, though.
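One way the command-line route might be driven from Python is to run the command with the standard subprocess module and then pull the word boxes out of the hOCR output. This is a minimal sketch, not a tested recipe: it assumes the tesseract executable is on the PATH, that the output file is named either result.html or result.hocr depending on the Tesseract version, and that each word appears as a span with class ocrx_word whose title attribute follows the class attribute and contains "bbox x0 y0 x1 y1". The function names run_hocr and parse_hocr_words are hypothetical.

```python
import html
import os
import re
import subprocess

# Matches a word span and captures its bbox numbers and inner markup.
WORD_RE = re.compile(
    r"<span[^>]*class=['\"]ocrx_word['\"][^>]*"
    r"title=['\"][^'\"]*?bbox (\d+) (\d+) (\d+) (\d+)[^'\"]*['\"][^>]*>"
    r"(.*?)</span>",
    re.S,
)


def parse_hocr_words(hocr_text):
    """Return a list of (word, (x0, y0, x1, y1)) pairs from hOCR markup."""
    words = []
    for match in WORD_RE.finditer(hocr_text):
        x0, y0, x1, y1 = (int(match.group(i)) for i in range(1, 5))
        # Drop nested tags such as <strong>, then decode HTML entities.
        text = html.unescape(re.sub(r"<[^>]+>", "", match.group(5))).strip()
        words.append((text, (x0, y0, x1, y1)))
    return words


def run_hocr(image_path, output_base):
    """Run the tesseract CLI in hOCR mode and return the parsed word boxes."""
    subprocess.check_call(["tesseract", image_path, output_base, "hocr"])
    # The output extension is assumed to be one of these two.
    for ext in (".hocr", ".html"):
        path = output_base + ext
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                return parse_hocr_words(f.read())
    raise IOError("no hOCR output found for " + output_base)
```

With these definitions, run_hocr("test.jpg", "result") would return the words together with their pixel coordinates, measured from the top-left corner of the image. A regular expression is used only to keep the sketch short; a proper HTML parser would be more robust against variations in the markup.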