Text detection on Seven Segment Display via Tesseract OCR
Problem description
The problem I am running into is extracting text from an image, for which I have used Tesseract v3.02. The sample images I have to extract text from are meter readings. Some of them have a solid sheet background and some have an LED display. I have trained on the dataset with the solid sheet background, and the results are reasonably effective.
The major problem I have now is that text images with an LED/LCD background are not recognized by Tesseract, and because of this the training set isn't generated.
Can anyone point me in the right direction on how to use Tesseract with a seven-segment display (LCD/LED background), or is there an alternative I can use instead of Tesseract?
Recommended answer
https://github.com/upupnaway/digital-display-character-rec/blob/master/digital_display_ocr.py
I did this using OpenCV and Tesseract with the "letsgodigital" trained data.
The steps include edge detection and extracting the display using the largest contour. Then threshold the image using Otsu's method or simple binarization and pass it to pytesseract's image_to_string function.