我自己的OCR程序在Python [英] My own OCR-program in Python

查看:275
本文介绍了我自己的OCR程序在Python的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我还是个初学者,但我想写一个字符识别程序。这个程序还没有完成。而且我修改了很多,因此注释可能与代码不完全匹配。我将使用 8 连通性来做连通区域标记。

from PIL import Image
import numpy as np

im = Image.open("D:\\Python26\\PYTHON-PROGRAMME\\bild_schrift.jpg")

w,h = im.size
w = int(w)
h = int(h)

#2D-Array for area
area = []
for x in range(w):
    area.append([])
    for y in range(h):
        area[x].append(2) #number 0 is white, number 1 is black

#2D-Array for letter
letter = []
for x in range(50):
    letter.append([])
    for y in range(50):
        letter[x].append(0)

#2D-Array for label
label = []
for x in range(50):
    label.append([])
    for y in range(50):
        label[x].append(0)

#image to number conversion
pix = im.load()
threshold = 200
for x in range(w):
    for y in range(h):
        aaa = pix[x, y]
        bbb = aaa[0] + aaa[1] + aaa[2] #total value
        if bbb<=threshold:
            area[x][y] = 1
        if bbb>threshold:
            area[x][y] = 0
np.set_printoptions(threshold='nan', linewidth=10)

#matrix transponation
ccc = np.array(area)
area = ccc.T #better solution?

#find all black pixel and set temporary label numbers
i=1
for x in range(40): # width (later)
    for y in range(40): # heigth (later)
        if area[x][y]==1:
            letter[x][y]=1
            label[x][y]=i
            i += 1

#connected components labeling
for x in range(40): # width (later)
    for y in range(40): # heigth (later)
        if area[x][y]==1:
            label[x][y]=i
            #if pixel has neighbour:
            if area[x][y+1]==1:
                #pixel and neighbour get the lowest label
                pass # tomorrows work
            if area[x+1][y]==1:
                #pixel and neighbour get the lowest label
                pass # tomorrows work
            #should i also compare pixel and left neighbour?

#find width of the letter
#find height of the letter
#find the middle of the letter
#middle = [width/2][height/2] #?
#divide letter into 30 parts --> 5 x 6 array

#model letter
#letter A-Z, a-z, 0-9 (maybe more)

#compare each of the 30 parts of the letter with all model letters
#make a weighting

#print(letter)

im.save("D:\\Python26\\PYTHON-PROGRAMME\\bild2.jpg")
print('done')


解决方案

OCR 的确不是一件容易的事。这正是文本验证码(CAPTCHA)系统至今仍然有效的原因 :)

只谈字母的提取而不谈模式识别的话:你用来分离字母的这种技术称为连通区域标记(Connected Component Labeling)。既然你想要一种更高效的做法,可以尝试实现这篇文章中描述的两遍扫描(two-pass)算法。另一种描述可以在 Blob extraction 一文中找到。

修改:下面是我所提出的算法的实现:

import sys
from PIL import Image, ImageDraw

class Region():
    def __init__(self, x, y):
        self._pixels = [(x, y)]
        self._min_x = x
        self._max_x = x
        self._min_y = y
        self._max_y = y

    def add(self, x, y):
        self._pixels.append((x, y))
        self._min_x = min(self._min_x, x)
        self._max_x = max(self._max_x, x)
        self._min_y = min(self._min_y, y)
        self._max_y = max(self._max_y, y)

    def box(self):
        return [(self._min_x, self._min_y), (self._max_x, self._max_y)]

def find_regions(im):
    width, height  = im.size
    regions = {}
    pixel_region = [[0 for y in range(height)] for x in range(width)]
    equivalences = {}
    n_regions = 0
    #first pass. find regions.
    for x in xrange(width):
        for y in xrange(height):
            #look for a black pixel
            if im.getpixel((x, y)) == (0, 0, 0, 255): #BLACK
                # get the region number from north or west
                # or create new region
                region_n = pixel_region[x-1][y] if x > 0 else 0
                region_w = pixel_region[x][y-1] if y > 0 else 0

                max_region = max(region_n, region_w)

                if max_region > 0:
                    #a neighbour already has a region
                    #new region is the smallest > 0
                    new_region = min(filter(lambda i: i > 0, (region_n, region_w)))
                    #update equivalences
                    if max_region > new_region:
                        if max_region in equivalences:
                            equivalences[max_region].add(new_region)
                        else:
                            equivalences[max_region] = set((new_region, ))
                else:
                    n_regions += 1
                    new_region = n_regions

                pixel_region[x][y] = new_region

    #Scan image again, assigning all equivalent regions the same region value.
    for x in xrange(width):
        for y in xrange(height):
                r = pixel_region[x][y]
                if r > 0:
                    while r in equivalences:
                        r = min(equivalences[r])

                    if not r in regions:
                        regions[r] = Region(x, y)
                    else:
                        regions[r].add(x, y)

    return list(regions.itervalues())

def main():
    im = Image.open(r"c:\users\personal\py\ocr\test.png")
    regions = find_regions(im)
    draw = ImageDraw.Draw(im)
    for r in regions:
        draw.rectangle(r.box(), outline=(255, 0, 0))
    del draw
    #im.show()
    output = file("output.png", "wb")
    im.save(output)
    output.close()

if __name__ == "__main__":
    main()

这里是输出文件:


  

死链接


这并非 100% 完美,但既然你只是出于学习目的来做这件事,它可以作为一个不错的起点。有了每个字符的边界框,你现在就可以像这里其他人建议的那样使用神经网络了。

I am still a beginner but I want to write a character-recognition-program. This program isn't ready yet. And I edited a lot, therefor the comments may not match exactly. I will use the 8-connectivity for the connected component labeling.

import sys

import numpy as np
from PIL import Image

# Character-recognition experiment: binarize an image, then assign
# temporary label numbers to black pixels as the first step of an
# 8-connectivity connected-component labeling (merging is still TODO).

im = Image.open("D:\\Python26\\PYTHON-PROGRAMME\\bild_schrift.jpg")
w, h = im.size  # PIL's Image.size already yields ints; no int() cast needed

# 2D array for the binarized image; 2 marks "not yet classified" and is
# overwritten below with 0 (white) or 1 (black).
area = [[2] * h for _ in range(w)]

# 2D working arrays, fixed at 50x50 for now (to be sized dynamically later):
letter = [[0] * 50 for _ in range(50)]  # 1 where a letter pixel was found
label = [[0] * 50 for _ in range(50)]   # temporary component label per pixel

# Image-to-number conversion: classify each pixel by the sum of its
# RGB channels against a fixed threshold (assumes an RGB/RGBA image --
# TODO confirm the mode of bild_schrift.jpg).
pix = im.load()
threshold = 200
for x in range(w):
    for y in range(h):
        rgb = pix[x, y]
        total = rgb[0] + rgb[1] + rgb[2]  # total channel value
        # <= threshold counts as black (ink), everything brighter as white
        area[x][y] = 1 if total <= threshold else 0

# threshold='nan' is rejected by modern NumPy (the option must be numeric);
# sys.maxsize gives the intended "always print the full array" behavior.
np.set_printoptions(threshold=sys.maxsize, linewidth=10)

# Matrix transposition so the first index walks the other image axis.
area = np.array(area).T  # better solution?

# Find all black pixels and set temporary, unique label numbers.
i = 1
for x in range(40):  # width (hard-coded for now, use w later)
    for y in range(40):  # height (hard-coded for now, use h later)
        if area[x][y] == 1:
            letter[x][y] = 1
            label[x][y] = i
            i += 1

# Connected-component labeling (incomplete: neighbour merging is TODO).
# NOTE(review): this pass overwrites every temporary label with the final
# value of i from the loop above -- looks unintended; revisit when the
# merging logic is written.
for x in range(40):  # width (later)
    for y in range(40):  # height (later)
        if area[x][y] == 1:
            label[x][y] = i
            # if pixel has a neighbour:
            if area[x][y + 1] == 1:
                # pixel and neighbour get the lowest label
                pass  # tomorrow's work
            if area[x + 1][y] == 1:
                # pixel and neighbour get the lowest label
                pass  # tomorrow's work
            # should I also compare pixel and left neighbour?

# find width of the letter
# find height of the letter
# find the middle of the letter
# middle = [width/2][height/2] #?
# divide letter into 30 parts --> 5 x 6 array

# model letter
# letter A-Z, a-z, 0-9 (maybe more)

# compare each of the 30 parts of the letter with all model letters
# make a weighting

# print(letter)

im.save("D:\\Python26\\PYTHON-PROGRAMME\\bild2.jpg")
print('done')

解决方案

OCR is not an easy task indeed. That's why text CAPTCHAs still work :)

To talk only about the letter extraction and not the pattern recognition, the technique you are using to separate the letters is called Connected Component Labeling. Since you are asking for a more efficient way to do this, try to implement the two-pass algorithm that's described in this article. Another description can be found in the article Blob extraction.

EDIT: Here's the implementation for the algorithm that I have suggested:

import sys
from PIL import Image, ImageDraw

class Region():
    """Axis-aligned bounding region accumulated from individual pixels."""

    def __init__(self, x, y):
        # Seed the region with its first pixel; the bounding box starts
        # out degenerate (a single point) until more pixels are added.
        self._pixels = [(x, y)]
        self._min_x = self._max_x = x
        self._min_y = self._max_y = y

    def add(self, x, y):
        """Record pixel (x, y) and grow the bounding box to include it."""
        self._pixels.append((x, y))
        if x < self._min_x:
            self._min_x = x
        if x > self._max_x:
            self._max_x = x
        if y < self._min_y:
            self._min_y = y
        if y > self._max_y:
            self._max_y = y

    def box(self):
        """Return the bounding box as [(min_x, min_y), (max_x, max_y)]."""
        return [(self._min_x, self._min_y), (self._max_x, self._max_y)]

def find_regions(im):
    """Two-pass connected-component labeling over black pixels of *im*.

    Scans the image left-to-right, top-to-bottom, assigning provisional
    region numbers from the already-visited neighbours and recording
    label equivalences; a second pass resolves equivalences and collects
    pixels into Region objects.

    Args:
        im: a PIL image; assumes RGBA pixels where pure black is
            (0, 0, 0, 255) -- TODO confirm for other modes.

    Returns:
        list of Region, one per connected component.
    """
    width, height = im.size
    regions = {}
    # provisional region number per pixel; 0 means background
    pixel_region = [[0 for y in range(height)] for x in range(width)]
    # maps a region number to the set of smaller equivalent region numbers
    equivalences = {}
    n_regions = 0

    # First pass: find regions.
    # (range, not Python-2-only xrange, so this runs under Python 3.)
    for x in range(width):
        for y in range(height):
            # look for a black pixel
            if im.getpixel((x, y)) == (0, 0, 0, 255):  # BLACK
                # get the region number from north or west
                # or create a new region
                region_n = pixel_region[x - 1][y] if x > 0 else 0
                region_w = pixel_region[x][y - 1] if y > 0 else 0

                max_region = max(region_n, region_w)

                if max_region > 0:
                    # a neighbour already has a region;
                    # the new region is the smallest one > 0
                    new_region = min(filter(lambda i: i > 0, (region_n, region_w)))
                    # update equivalences
                    if max_region > new_region:
                        if max_region in equivalences:
                            equivalences[max_region].add(new_region)
                        else:
                            equivalences[max_region] = set((new_region,))
                else:
                    n_regions += 1
                    new_region = n_regions

                pixel_region[x][y] = new_region

    # Second pass: scan the image again, assigning all equivalent regions
    # the same region value.
    for x in range(width):
        for y in range(height):
            r = pixel_region[x][y]
            if r > 0:
                # follow the equivalence chain down to the smallest label
                while r in equivalences:
                    r = min(equivalences[r])

                if r not in regions:
                    regions[r] = Region(x, y)
                else:
                    regions[r].add(x, y)

    # dict.itervalues() was removed in Python 3; values() is the equivalent
    return list(regions.values())

def main():
    """Load the test image, outline every detected region and save the result."""
    im = Image.open(r"c:\users\personal\py\ocr\test.png")
    regions = find_regions(im)
    draw = ImageDraw.Draw(im)
    for r in regions:
        # outline each connected component's bounding box in red
        draw.rectangle(r.box(), outline=(255, 0, 0))
    del draw
    #im.show()
    # The file() builtin was removed in Python 3; a with-block also
    # guarantees the handle is closed even if im.save raises.
    with open("output.png", "wb") as output:
        im.save(output)

if __name__ == "__main__":
    main()

And here is the output file:

Dead link

It's not 100% perfect, but since you are doing this only for learning purposes, it may be a good starting point. With the bounding box of each character you can now use a neural network as others have suggested here.

这篇关于我自己的OCR程序在Python的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆