Image Preprocessing steps to improve the recognition rate
Problem description
I am building a simple OCR Android app using TessBaseAPI for my project. I have applied some image preprocessing steps, such as binarization and image enhancement, but the recognition rate is only 50% to 60%. How can I improve it?
I have included two sample images.
http://imageshack.us/photo/my-images/94/1school.jpg/
http://imageshack.us/photo/my-images/43/15071917.jpg/
Recommended answer
The following additions to the above command work for your second image:
-negate \
-deskew 40% \
+repage \
-crop 393x110+0+0 \
They add appropriate levels of deskewing and cropping to the result, so that Tesseract's life gets a bit easier...
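For readers unfamiliar with the -crop argument: ImageMagick geometries of this kind take the form WIDTHxHEIGHT+X+Y, so 393x110+0+0 keeps a region 393 pixels wide and 110 pixels high, starting at the top-left corner. The +repage that precedes it resets the virtual canvas offset that -deskew can leave behind, so the crop offsets are measured from the visible image. The sketch below only illustrates how the geometry string breaks down; parse_geometry is a hypothetical helper written for this example, not part of ImageMagick, and it handles only the simple non-negative form used here.

```shell
#!/bin/sh
# parse_geometry is a hypothetical helper for illustration only.
# It splits a geometry string of the form WIDTHxHEIGHT+X+Y into its parts.
parse_geometry() {
  geometry=$1
  size=${geometry%%+*}      # text before the first '+', e.g. 393x110
  offsets=${geometry#*+}    # text after the first '+',  e.g. 0+0
  width=${size%x*}
  height=${size#*x}
  x=${offsets%+*}
  y=${offsets#*+}
  echo "width=$width height=$height x=$x y=$y"
}

parse_geometry 393x110+0+0
```

The crop size is specific to the second sample image, so a different image would need its own values.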
So the complete command should be the following, which produces the correct result on my system:
convert 15071917.jpg \
-type grayscale \
-negate \
-gamma 1 \
-contrast -contrast -contrast -contrast -contrast -contrast -contrast -contrast -contrast -contrast \
-normalize -normalize -normalize -normalize -normalize -normalize -normalize -normalize -normalize -normalize \
-despeckle -despeckle -despeckle -despeckle -despeckle -despeckle -despeckle -despeckle -despeckle -despeckle \
-negate \
-deskew 40% \
+repage \
-crop 393x110+0+0 \
15071917.png \
&& \
tesseract 15071917.png OUT && cat OUT.txt
Tesseract Open Source OCR Engine v3.01 with Leptonica
Page 0
TESCO
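The command above repeats -contrast, -normalize and -despeckle ten times each. If you want to experiment with that repeat count, one possible approach is to build the argument list in a loop. The following is a minimal sketch under stated assumptions, not the answerer's method: the function name preprocess, the DRY_RUN variable and the repeat parameter are all invented for this example, and by default it only prints the command rather than running it. On ImageMagick 7 the convert program may need to be invoked as magick instead.

```shell
#!/bin/sh
# preprocess INPUT OUTPUT REPEATS
# Hypothetical wrapper: builds the same option sequence as the command above,
# repeating each enhancement option REPEATS times.
# DRY_RUN (an assumption of this sketch) defaults to 1, which only prints
# the command; set DRY_RUN=0 to actually run ImageMagick.
preprocess() {
  input=$1
  output=$2
  repeats=$3

  # Start the argument list with the fixed leading options.
  set -- "$input" -type grayscale -negate -gamma 1

  # Append each enhancement option the requested number of times,
  # keeping the original order: contrast, then normalize, then despeckle.
  for opt in -contrast -normalize -despeckle; do
    i=0
    while [ "$i" -lt "$repeats" ]; do
      set -- "$@" "$opt"
      i=$((i + 1))
    done
  done

  # Append the fixed trailing options and the output file.
  set -- "$@" -negate -deskew 40% +repage -crop 393x110+0+0 "$output"

  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo convert "$@"
  else
    convert "$@"
  fi
}

# Print the command equivalent to the one in the answer.
preprocess 15071917.jpg 15071917.png 10
```

With DRY_RUN=0 the resulting PNG can then be passed to tesseract exactly as in the command above. Lower repeat counts may be enough for cleaner images, but that is something to verify against your own samples.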
This is the original picture (left) next to the picture produced by the modified command (right):