Best way to recognize characters in screenshot?

Question

What would you recommend for recognizing all characters in a screenshot? The screenshot is perfectly clean (only black text on a white background), and I can choose any standard font installed on Windows for the text. I have tried a few OCR tools (Tesseract and the like), but they misrecognized some characters. That baffled me, since the text has no noise at all and the fonts were among the most common ones (Courier New, Fixedsys, etc.). I need the result to be 100% accurate. Is there a library for this specific purpose, some kind of pattern recognition? Or should I take the screenshot with a monospaced font, step through the image moving right by +font_size pixels each time, and compare each captured cell against an in-memory representation of the letters and digits of the same font at the same size? What would be the best approach to this problem? Thank you very much in advance.

UPDATE: I finally got 100% accuracy by training Tesseract on a monospaced font (Courier New) at the exact size used in my screenshots. Hope that helps someone in the future :)

Recommended answer

Since this is the first result on Google for "tesseract recognize screenshot", let me do a bit of necromancy and add a much simpler solution.

Tesseract expects images at around 300 dpi or more, while the standard dpi on Windows is 96. This means you need to rescale the screenshot to about 300% (300 / 96 ≈ 3.1) before running OCR. After that, the results improve dramatically.

100%

Result: Whal would you recommend for recognizing all characters from a screensnor 7

200%

Result: What would you recommend for recognizing all chamcters from a screenth ?

300%

Result: What would you recommend for recognizing all characters from a screenshot ?

Anything above 300% works just as well.
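
The answer does not say which tool was used for rescaling, so here is a minimal Python sketch of the idea, assuming the Pillow and pytesseract packages are installed and the Tesseract binary is on the PATH. The function names (upscale_factor, scaled_size, ocr_screenshot) are hypothetical helpers made up for this illustration, and LANCZOS resampling is my own choice rather than anything the answer specifies.

```python
import math


def upscale_factor(source_dpi: float = 96, target_dpi: float = 300) -> float:
    """Factor needed to bring a screenshot from source_dpi up to target_dpi.

    Never returns less than 1.0, so images are only ever enlarged.
    """
    return max(1.0, target_dpi / source_dpi)


def scaled_size(size, factor):
    """Pixel dimensions after scaling, rounded up to whole pixels."""
    width, height = size
    return (math.ceil(width * factor), math.ceil(height * factor))


def ocr_screenshot(path, source_dpi=96, target_dpi=300):
    """Upscale a screenshot and pass it to Tesseract.

    Assumption: Pillow and pytesseract are available; they are imported
    here so the helpers above work without them.
    """
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        factor = upscale_factor(source_dpi, target_dpi)
        # LANCZOS is an illustrative choice of resampling filter.
        enlarged = img.resize(scaled_size(img.size, factor), Image.LANCZOS)
    return pytesseract.image_to_string(enlarged)


if __name__ == "__main__":
    # 96 dpi -> 300 dpi works out to 3.125, i.e. roughly the 300% above.
    print(upscale_factor())
    print(scaled_size((100, 20), upscale_factor()))
```

Calling ocr_screenshot("screenshot.png") on a 96 dpi capture would enlarge it by 3.125x before recognition; "screenshot.png" is a placeholder path.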
