Digit Recognition with Bayesian classes


Problem description

I need to write an OCR program for digits only. I will use the MNIST dataset. The problem is that I do not know where to start. There are a lot of papers that don't really explain the algorithm, and I don't have much knowledge of pattern recognition. So I have a few questions.

Q1: Where can I find the algorithm (or a tutorial)?

Q2: How do I classify digits? I don't need anything very advanced. The first thing that comes to mind is finding the ratio of the upper half to the lower half and of the left side to the right side. Are there more useful and easier classification methods?

Q3: What are back propagation and the layers shown in most of the papers? Do I need them for my simple OCR?

Note: I know my OCR program won't be accurate. It isn't very important for now.

Recommended answer

If the closest engineering library to you has a section on image processing, computer vision, or machine vision, then with luck that library will have a copy of a book I recommend for OCR:

Character Recognition Systems

This book provides a fairly comprehensive overview of OCR techniques and recent research. It does not go into great depth on any particular subject, but it does provide references to academic papers.

Make sure you have access to a good introductory textbook on image processing. The book by Gonzalez and Woods is a standard in many universities:

Digital Image Processing

Even "simple" OCR gets tricky very quickly. It could be overwhelming if you jump into a class about neural networks, Bayes' theorem, etc., before you have a firm grasp of basic image processing principles.

If you can, try writing one or more OCR algorithms for machine-printed characters before you attempt to write an algorithm for handwritten characters.

Q1: Where can I find the algorithm (or a tutorial)?

There are numerous algorithms for OCR. The Cheriet book will give you a good start.

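Q2: How do I classify digits? I don't need anything very advanced. The first thing that comes to mind is finding the ratio of the upper half to the lower half and of the left side to the right side. Are there more useful and easier classification methods?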

Try implementing that technique and see how well it works. Even if the implementation doesn't work as well as you'd like, lessons learned while implementing it could help you later.
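As a starting point, the half-ratio idea from the question could be sketched roughly as follows. This is only an illustration: it assumes the digit is a 2D list of pixel intensities (ink is nonzero, background is 0), and the name `half_ratios` is made up for this example.

```python
def half_ratios(image):
    """Return (upper/lower, left/right) ink ratios for a 2D list of pixel values.

    Hypothetical helper: assumes ink pixels are > 0 and background is 0.
    """
    rows = len(image)
    cols = len(image[0])
    upper = sum(sum(row) for row in image[: rows // 2])
    lower = sum(sum(row) for row in image[rows // 2 :])
    left = sum(sum(row[: cols // 2]) for row in image)
    right = sum(sum(row[cols // 2 :]) for row in image)
    # A tiny constant avoids division by zero when one half has no ink.
    eps = 1e-9
    return upper / (lower + eps), left / (right + eps)


# A crude 4 x 4 "7": most of the ink is in the upper and right halves.
seven = [
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
]
upper_lower, left_right = half_ratios(seven)
```

Two numbers per digit will not separate all ten classes, but plotting them for a few hundred MNIST samples is a quick way to see which digits overlap.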

You can also subdivide a character into a 2 x 2 grid or 3 x 3 grid and check the relative densities of pixels in each cell. Unlike machine-printed characters, handwritten characters won't line up nicely in rectilinear grids.
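The grid idea could look something like the sketch below. It assumes the digit is a 2D list of pixel intensities; `zone_densities` and its `grid` parameter are illustrative names, not part of any library.

```python
def zone_densities(image, grid=3):
    """Split a 2D list of pixels into grid x grid zones and return the mean
    intensity of each zone, row by row (hypothetical helper)."""
    rows, cols = len(image), len(image[0])
    features = []
    for gr in range(grid):
        # Integer boundaries so that the zones cover the whole image.
        r0, r1 = gr * rows // grid, (gr + 1) * rows // grid
        for gc in range(grid):
            c0, c1 = gc * cols // grid, (gc + 1) * cols // grid
            total = sum(sum(row[c0:c1]) for row in image[r0:r1])
            area = (r1 - r0) * (c1 - c0)
            features.append(total / area if area else 0.0)
    return features


sample = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
features = zone_densities(sample, grid=2)
```

For a 28 x 28 MNIST image with `grid=3`, this yields nine numbers per digit that can be compared between samples.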

Template matching using normalized correlation is simple, and it can work reasonably well for machine printed characters for a single, known font. It's relatively simple to implement and worth learning: http://en.wikipedia.org/wiki/Cross-correlation#Normalized_cross-correlation
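A minimal sketch of template matching with zero-mean normalized correlation is shown below. It assumes the image and every template are 2D lists of the same size and already aligned; `normalized_correlation`, `classify`, and the tiny 3 x 3 templates are invented for illustration.

```python
import math


def normalized_correlation(a, b):
    """Zero-mean normalized cross-correlation of two same-sized 2D lists.

    Returns a value in [-1, 1]; returns 0.0 if either image is constant.
    """
    xs = [p for row in a for p in row]
    ys = [p for row in b for p in row]
    if len(xs) != len(ys):
        raise ValueError("images must have the same number of pixels")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = math.sqrt(
        sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys)
    )
    return num / den if den else 0.0


def classify(image, templates):
    """Return the label whose template correlates best with the image."""
    return max(templates, key=lambda label: normalized_correlation(image, templates[label]))


# Toy templates: a vertical bar for "1" and a ring for "0".
templates = {
    "1": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "0": [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
}
noisy_one = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]
label = classify(noisy_one, templates)
```

With handwritten digits a single template per class is rarely enough, which is consistent with the caveat above about known fonts.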

For OCR it's common to thin the characters in your sample as an initial step. Thinning is a technique to reduce a character (or any other shape) to a representation that is 1 pixel wide. Once you have a thinned character it can be easier to identify lines and intersections. If you can identify lines (or curves) and intersections, then one technique is to look at the relative position and angle of each line with respect to the others.

Common thinning algorithms include Stentiford and Zhang-Suen. There's a freeware version of WinTopo that demonstrates both of these algorithms: http://wintopo.com/
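For reference, a compact sketch of Zhang-Suen thinning follows. It assumes a binary image stored as a 2D list of 0/1 values with at least a one-pixel background border (border pixels are never examined); the function name `zhang_suen_thin` is illustrative.

```python
def zhang_suen_thin(image):
    """Thin a binary image (2D list of 0/1) with the Zhang-Suen two-pass iteration.

    Assumes the outermost rows and columns are background.
    """
    img = [list(row) for row in image]
    rows, cols = len(img), len(img[0])

    def neighbours(r, c):
        # P2..P9, clockwise, starting from the pixel directly above.
        return [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1], img[r + 1][c + 1],
                img[r + 1][c], img[r + 1][c - 1], img[r][c - 1], img[r - 1][c - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(1, rows - 1):
                for c in range(1, cols - 1):
                    if img[r][c] != 1:
                        continue
                    p = neighbours(r, c)
                    ink = sum(p)  # number of foreground neighbours
                    # Count 0 -> 1 transitions around the ring P2, P3, ..., P9, P2.
                    transitions = sum(
                        1 for i in range(8) if p[i] == 0 and p[(i + 1) % 8] == 1
                    )
                    p2, p4, p6, p8 = p[0], p[2], p[4], p[6]
                    if step == 0:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= ink <= 6 and transitions == 1 and ok:
                        to_clear.append((r, c))
            # Pixels are removed only after the whole pass has been evaluated.
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img


# A bar three pixels thick, surrounded by background.
bar = [[0] * 9 for _ in range(7)]
for r in (2, 3, 4):
    for c in range(1, 8):
        bar[r][c] = 1
skeleton = zhang_suen_thin(bar)
```

MNIST images are grayscale, so they would need to be thresholded to 0/1 before a routine like this could be applied.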

You can look into academic papers about "stroke extraction", but those techniques tend to be more difficult to implement.

Q3: What are back propagation and the layers shown in most of the papers? Do I need them for my simple OCR?

These terms refer to artificial neural networks. For a simple OCR algorithm you'll either hard-code the recognition logic or use simple training methods. Artificial neural networks can be trained to recognize characters that aren't hard-coded in your software. http://en.wikipedia.org/wiki/Neural_network
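As one possible example of a "simple training method" that involves no neural network, a nearest-centroid classifier averages the feature vectors of each digit and assigns a new sample to the closest average. This technique is swapped in here purely for illustration; the answer does not name it. The function names and the two-element toy features are assumptions.

```python
def train_centroids(samples):
    """samples: list of (feature_vector, label) pairs.

    Returns {label: mean feature vector} (hypothetical helper).
    """
    sums, counts = {}, {}
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, features):
    """Return the label of the centroid closest to features (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: sq_dist(centroids[label], features))


# Toy training set: two made-up features per sample.
training = [
    ([1.0, 0.0], "1"),
    ([0.8, 0.2], "1"),
    ([0.0, 1.0], "7"),
    ([0.2, 0.8], "7"),
]
centroids = train_centroids(training)
guess = predict(centroids, [0.7, 0.1])
```

In practice the feature vectors could be anything computed from the digit image, such as per-zone pixel densities.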

Although you don't need to learn about artificial neural networks to write a simple OCR algorithm, a simple algorithm will have only limited success with handwritten characters.

Above all, keep in mind that OCR for handwritten characters is an extremely difficult problem. If you could achieve a handwritten character read rate of 20% with a simple technique, then consider that a success.
