Tesseract thinks my 1's are 7's

Views: 89 | Published: 2020/5/19 19:32:20 | Tags: ocr, tesseract

Problem description

This seems like a common OCR problem. Is there a way to tell Tesseract that my 1's are actually 1's?

Ideally without turning my 7's into 1's in the process.

Note: these are scanned documents, and I have no idea what font was used.

Recommended answer

Tesseract is trainable, so try training it manually on the font in your documents. That should solve the problem.

There is another possible solution. Build a small validation module that runs after Tesseract. For every character recognized as a 1 or a 7, double-check it with an intensity-based method. For example, find corners (feature points) on the glyph, apply KLT tracking against a 1 template and a 7 template, and see which one gives the better tracking result. This method is costly, but since you only try two templates and the glyph images are small, I don't think it will hurt performance much.
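A minimal sketch of such a validation step, to make the idea concrete. Note that it swaps KLT tracking for plain normalized cross-correlation against the two templates, which is a simpler intensity-based comparison and not the same technique. The 5x5 templates and the names `recheck_digit` and `ncc` are made up for illustration; in practice you would crop the templates from your own scans and resize each glyph to the template size first.

```python
from math import sqrt

# Hypothetical 5x5 binary templates ('#' = ink). Real templates would be
# cropped from the scanned documents, since the font is unknown.
TEMPLATE_1 = [
    "..#..",
    ".##..",
    "..#..",
    "..#..",
    ".###.",
]
TEMPLATE_7 = [
    "#####",
    "....#",
    "...#.",
    "..#..",
    "..#..",
]


def to_vector(rows):
    """Flatten a glyph given as strings into a list of 0/1 intensities."""
    return [1.0 if ch == "#" else 0.0 for row in rows for ch in row]


def ncc(a, b):
    """Normalized cross-correlation of two equal-length intensity vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    denom = sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    if denom == 0:
        return 0.0  # a flat image carries no shape information
    return sum(x * y for x, y in zip(da, db)) / denom


def recheck_digit(glyph_rows, ocr_char):
    """Re-classify a glyph that OCR read as '1' or '7'; leave others alone."""
    if ocr_char not in ("1", "7"):
        return ocr_char
    glyph = to_vector(glyph_rows)
    score_1 = ncc(glyph, to_vector(TEMPLATE_1))
    score_7 = ncc(glyph, to_vector(TEMPLATE_7))
    return "1" if score_1 >= score_7 else "7"


# A slightly degraded '1' (no base stroke) that OCR reported as '7'.
noisy_one = [
    "..#..",
    ".##..",
    "..#..",
    "..#..",
    "..#..",
]
print(recheck_digit(noisy_one, "7"))
```

Because only glyphs already read as 1 or 7 are re-scored, and only against two small templates, the extra cost per page stays low.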

If neither solution is possible, try to handle it in post-processing. For example, if the field is a student's age, the value is not 78 but 18, and so on. This is a poor approach and not a real solution, but when nothing else is possible you have to do something like it.
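The age example could be sketched roughly like this. It assumes you know a plausible range for the field (the 16 to 30 range below is invented), and the names `fix_number` and `plausible_readings` are hypothetical helpers. The value is only corrected when exactly one 1/7 substitution lands inside the range; otherwise it is left for manual review.

```python
from itertools import product


def plausible_readings(text):
    """Yield every string obtained by reading each '1' or '7' as either digit."""
    options = [("1", "7") if ch in "17" else (ch,) for ch in text]
    for combo in product(*options):
        yield "".join(combo)


def fix_number(text, low, high):
    """Return a value within [low, high], or None if no safe fix exists.

    The OCR text is kept when it is already in range. Otherwise it is
    replaced only if exactly one 1/7 substitution falls inside the range.
    """
    if not (text.isascii() and text.isdigit()):
        return None
    if low <= int(text) <= high:
        return text
    candidates = [c for c in plausible_readings(text) if low <= int(c) <= high]
    return candidates[0] if len(candidates) == 1 else None


# Hypothetical student-age field with an assumed valid range of 16 to 30.
print(fix_number("78", 16, 30))
```

Returning `None` for ambiguous cases (for example `77` with a range of 10 to 20, where both 11 and 17 fit) avoids silently turning real 7's into 1's.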
