"Adding" new fonts to Tesseract eng.traineddata
Problem description
As far as I know, Tesseract 3.x comes with 6 English fonts (correct me if I'm wrong). I need to train Tesseract on 5 more fonts. I only need capital letters and digits (no special characters or symbols).
I followed various procedures, for example "Adding New Fonts to Tesseract 3 OCR Engine", and also used tools that automate the process, such as Serak Tesseract Trainer for Tesseract 3.02. To generate the box files I used QT Box Editor.
After using the tools above I get an eng.traineddata file. All the tutorials tell me to add this eng.traineddata file to the Tesseract-OCR\tessdata folder, but doing so replaces the original eng.traineddata file. After doing this, will I lose the default fonts that come with Tesseract 3.x?
How can I add new fonts? It's still not clear to me. I hope someone can help me here. Thanks.
Recommended answer
You should give the new file a different name, e.g., eng1.traineddata. That way the original eng.traineddata is not overwritten, and you can use the new data together with the original by specifying the language option -l eng+eng1.
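As a rough illustration of that advice, here is a minimal Python sketch. The helper names install_traineddata and tesseract_command are hypothetical, as are the example paths; the only parts taken from the answer are the renamed file (eng1.traineddata) and the -l eng+eng1 option. It assumes the tesseract executable is on the PATH when the command is actually run.

```python
import shutil
from pathlib import Path


def install_traineddata(src, tessdata_dir, new_lang="eng1"):
    """Copy a custom traineddata file into tessdata under a new language name.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    target = Path(tessdata_dir) / f"{new_lang}.traineddata"
    if target.exists():
        # Refuse to overwrite, so a stock file such as eng.traineddata
        # is never replaced by accident.
        raise FileExistsError(f"{target} already exists")
    shutil.copyfile(src, target)
    return target


def tesseract_command(image, output_base, langs=("eng", "eng1")):
    """Build a tesseract CLI call that loads several language packs at once."""
    # Languages are joined with '+', giving e.g. "-l eng+eng1".
    return ["tesseract", str(image), str(output_base), "-l", "+".join(langs)]
```

For example, after install_traineddata("my_fonts/eng.traineddata", r"C:\Tesseract-OCR\tessdata"), passing tesseract_command("scan.png", "scan") to subprocess.run would run OCR with both the stock and the custom data (the file names here are placeholders).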