Junk results when using Tesseract OCR and tess-two
Problem description
I have developed an OCR application using the Tesseract OCR library, working from the following projects:
- android-ocr
- tesseract
However, I sometimes get junk data as results. Can anyone suggest what else I can do to get accurate results?
Recommended answer
You should provide your test images, along with any code you are using, if you want help specific to your case. As a general rule of thumb, though, the following help in getting accurate results:
- Use a high-resolution image (if needed); 300 DPI is the minimum.
- Make sure there are no shadows or bends in the image.
- If there is any skew, fix the image in code before running OCR.
- Use a dictionary to help get good results.
- Adjust the text size (12 pt font is ideal).
- Binarize the image and use image processing algorithms to remove noise.
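As an illustration of the binarization step, here is a minimal sketch in plain Java. It assumes the image has already been converted to 8-bit grayscale values in an `int[]`, and it uses Otsu's method to pick the threshold. That is one common choice, not something the answer prescribes. The class and method names (`OtsuBinarizer`, `otsuThreshold`, `binarize`) are hypothetical and are not part of Tesseract or tess-two.

```java
// Hypothetical helper, not part of Tesseract or tess-two.
final class OtsuBinarizer {

    /** Returns a threshold in 0..255 chosen with Otsu's method. */
    static int otsuThreshold(int[] gray) {
        // Histogram of the 8-bit gray levels.
        int[] hist = new int[256];
        for (int v : gray) {
            hist[v & 0xFF]++;
        }

        long total = gray.length;
        long sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (long) i * hist[i];
        }

        long weightDark = 0;   // pixels with value <= t
        long sumDark = 0;      // sum of their gray levels
        double bestVariance = -1.0;
        int best = 0;

        for (int t = 0; t < 256; t++) {
            weightDark += hist[t];
            sumDark += (long) t * hist[t];
            if (weightDark == 0) {
                continue;          // no dark pixels yet
            }
            long weightLight = total - weightDark;
            if (weightLight == 0) {
                break;             // every pixel is already in the dark class
            }
            double meanDark = (double) sumDark / weightDark;
            double meanLight = (double) (sumAll - sumDark) / weightLight;
            double diff = meanDark - meanLight;
            // Between-class variance (up to a constant factor).
            double variance = (double) weightDark * weightLight * diff * diff;
            if (variance > bestVariance) {
                bestVariance = variance;
                best = t;
            }
        }
        return best;
    }

    /** Marks a pixel as ink (true) when its gray level is <= threshold. */
    static boolean[] binarize(int[] gray, int threshold) {
        boolean[] ink = new boolean[gray.length];
        for (int i = 0; i < gray.length; i++) {
            ink[i] = (gray[i] & 0xFF) <= threshold;
        }
        return ink;
    }
}
```

A single global threshold like this tends to work only when lighting is fairly even. For images with shadows, an adaptive (local) threshold is usually a better fit.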
On top of all this, many other image processing functions can help increase accuracy depending on your image, such as deskew, perspective correction, line removal, border removal, dot removal, despeckle, and many more.
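To make one of these operations concrete, the sketch below shows a very simple form of despeckling in plain Java: it drops ink pixels that have too few ink neighbours in their 8-pixel neighbourhood. It assumes a binarized image stored as a rectangular `boolean[][]` (true = ink). The `Despeckle` class and its `minNeighbours` parameter are hypothetical names for illustration. Real despeckle filters, including those in imaging libraries, are typically more sophisticated.

```java
// Hypothetical helper, not part of Tesseract or tess-two.
final class Despeckle {

    /**
     * Returns a copy of the image in which every ink pixel with fewer than
     * minNeighbours ink pixels among its 8 neighbours has been cleared.
     */
    static boolean[][] removeIsolatedPixels(boolean[][] ink, int minNeighbours) {
        int height = ink.length;
        int width = height == 0 ? 0 : ink[0].length;
        boolean[][] out = new boolean[height][width];

        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                if (!ink[y][x]) {
                    continue;  // background stays background
                }
                int neighbours = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        if (dy == 0 && dx == 0) {
                            continue;  // skip the pixel itself
                        }
                        int ny = y + dy;
                        int nx = x + dx;
                        if (ny >= 0 && ny < height && nx >= 0 && nx < width
                                && ink[ny][nx]) {
                            neighbours++;
                        }
                    }
                }
                // Counts are taken from the input, so removals do not cascade.
                out[y][x] = neighbours >= minNeighbours;
            }
        }
        return out;
    }
}
```

With `minNeighbours` set to 1, only fully isolated dots are removed. Higher values are more aggressive and may start to erase thin strokes or punctuation, so the value would need tuning per image.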