My own OCR-program in Python
Question
I am still a beginner, but I want to write a character-recognition program. The program isn't ready yet, and I have edited it a lot, so the comments may not match exactly. I will use 8-connectivity for the connected component labeling.
```python
from PIL import Image
import numpy as np

im = Image.open("D:\\Python26\\PYTHON-PROGRAMME\\bild_schrift.jpg")
w, h = im.size
w = int(w)
h = int(h)

# 2D array for area
area = []
for x in range(w):
    area.append([])
    for y in range(h):
        area[x].append(2)  # number 0 is white, number 1 is black

# 2D array for letter
letter = []
for x in range(50):
    letter.append([])
    for y in range(50):
        letter[x].append(0)

# 2D array for label
label = []
for x in range(50):
    label.append([])
    for y in range(50):
        label[x].append(0)

# image-to-number conversion
pix = im.load()
threshold = 200
for x in range(w):
    for y in range(h):
        aaa = pix[x, y]
        bbb = aaa[0] + aaa[1] + aaa[2]  # total value
        if bbb <= threshold:
            area[x][y] = 1
        if bbb > threshold:
            area[x][y] = 0

np.set_printoptions(threshold=np.inf, linewidth=10)

# matrix transposition
ccc = np.array(area)
area = ccc.T  # better solution?

# find all black pixels and set temporary label numbers
i = 1
for x in range(40):  # width (later)
    for y in range(40):  # height (later)
        if area[x][y] == 1:
            letter[x][y] = 1
            label[x][y] = i
            i += 1

# connected component labeling
for x in range(40):  # width (later)
    for y in range(40):  # height (later)
        if area[x][y] == 1:
            label[x][y] = i
            # if pixel has a neighbour:
            if area[x][y + 1] == 1:
                # pixel and neighbour get the lowest label
                pass  # tomorrow's work
            if area[x + 1][y] == 1:
                # pixel and neighbour get the lowest label
                pass  # tomorrow's work
            # should I also compare pixel and left neighbour?

# find width of the letter
# find height of the letter
# find the middle of the letter
# middle = [width/2][height/2]  #?
# divide letter into 30 parts --> 5 x 6 array
# model letters: A-Z, a-z, 0-9 (maybe more)
# compare each of the 30 parts of the letter with all model letters
# make a weighting

# print(letter)
im.save("D:\\Python26\\PYTHON-PROGRAMME\\bild2.jpg")
print('done')
```
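As an aside on the thresholding loops above: since NumPy is already imported, the per-pixel conversion and the transposition can be collapsed into a few vectorized operations. A hedged sketch; a tiny synthetic array stands in for `np.array(im)` because the original `bild_schrift.jpg` is not available:

```python
import numpy as np

# A 3x4 synthetic RGB image (rows = y, cols = x) standing in for
# np.array(im); all white except one black pixel at (x=1, y=1)
rgb = np.full((3, 4, 3), 255, dtype=int)
rgb[1, 1] = (0, 0, 0)

threshold = 200
total = rgb.sum(axis=2)                  # summed channel value per pixel
area = (total <= threshold).astype(int)  # 1 = black, 0 = white
area = area.T                            # index as area[x][y], like the loops above
print(area[1][1])  # prints 1 for the black pixel
```

This replaces both nested loops and the separate `ccc.T` step, and avoids building the Python lists entirely.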
Solution

OCR is not an easy task indeed. That's why text CAPTCHAs still work :)
To talk only about the letter extraction and not the pattern recognition, the technique you are using to separate the letters is called Connected Component Labeling. Since you are asking for a more efficient way to do this, try to implement the two-pass algorithm that's described in this article. Another description can be found in the article Blob extraction.
EDIT: Here's the implementation for the algorithm that I have suggested:
```python
from PIL import Image, ImageDraw


class Region():
    def __init__(self, x, y):
        self._pixels = [(x, y)]
        self._min_x = x
        self._max_x = x
        self._min_y = y
        self._max_y = y

    def add(self, x, y):
        self._pixels.append((x, y))
        self._min_x = min(self._min_x, x)
        self._max_x = max(self._max_x, x)
        self._min_y = min(self._min_y, y)
        self._max_y = max(self._max_y, y)

    def box(self):
        return [(self._min_x, self._min_y), (self._max_x, self._max_y)]


def find_regions(im):
    width, height = im.size
    regions = {}
    pixel_region = [[0 for y in range(height)] for x in range(width)]
    equivalences = {}
    n_regions = 0
    # first pass: find regions
    for x in range(width):
        for y in range(height):
            # look for a black pixel
            if im.getpixel((x, y)) == (0, 0, 0, 255):  # BLACK
                # get the region number from north or west
                # or create a new region
                region_n = pixel_region[x-1][y] if x > 0 else 0
                region_w = pixel_region[x][y-1] if y > 0 else 0

                max_region = max(region_n, region_w)

                if max_region > 0:
                    # a neighbour already has a region
                    # new region is the smallest > 0
                    new_region = min(filter(lambda i: i > 0, (region_n, region_w)))
                    # update equivalences
                    if max_region > new_region:
                        if max_region in equivalences:
                            equivalences[max_region].add(new_region)
                        else:
                            equivalences[max_region] = set((new_region,))
                else:
                    n_regions += 1
                    new_region = n_regions

                pixel_region[x][y] = new_region

    # scan the image again, assigning all equivalent regions the same region value
    for x in range(width):
        for y in range(height):
            r = pixel_region[x][y]
            if r > 0:
                while r in equivalences:
                    r = min(equivalences[r])
                if r not in regions:
                    regions[r] = Region(x, y)
                else:
                    regions[r].add(x, y)

    return list(regions.values())


def main():
    im = Image.open(r"c:\users\personal\py\ocr\test.png")
    regions = find_regions(im)
    draw = ImageDraw.Draw(im)
    for r in regions:
        draw.rectangle(r.box(), outline=(255, 0, 0))
    del draw
    # im.show()
    with open("output.png", "wb") as output:
        im.save(output)


if __name__ == "__main__":
    main()
```
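If pulling in SciPy is an option, the same two-pass labeling (plus the bounding boxes) is available off the shelf via `scipy.ndimage.label` and `find_objects`. A small sketch on a hand-made binary array (assuming SciPy is installed; the default structuring element gives 4-connectivity):

```python
import numpy as np
from scipy import ndimage

# Binary image: 1 = black pixel; two separate blobs
area = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
])

# label() performs connected-component labeling in one call
labeled, n = ndimage.label(area)
print(n)  # 2 components

# find_objects() returns one (row-slice, col-slice) pair per component,
# i.e. the bounding box of each blob
boxes = ndimage.find_objects(labeled)
print(boxes)
```

Passing `structure=np.ones((3, 3))` to `label()` switches to the 8-connectivity the question originally asked about.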
And here is the output file:
It's not 100% perfect, but since you are doing this only for learning purposes, it may be a good starting point. With the bounding box of each character you can now use a neural network as others have suggested here.
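Before reaching for a neural network, the question's own plan (divide each letter into 30 parts and compare against model letters) can also be sketched directly. This is a hedged illustration with made-up shapes: a hypothetical 30×20 crop taken from one region's bounding box, cut into a 6×5 grid of cells whose black-pixel densities form the feature vector:

```python
import numpy as np

# Hypothetical 30x20 binary crop of one character (1 = black),
# here just a synthetic vertical bar for illustration
crop = np.zeros((30, 20), dtype=int)   # rows = y, cols = x
crop[:, 8:12] = 1

# Split into 6 x 5 = 30 cells and take each cell's black-pixel density
cells = crop.reshape(6, 5, 5, 4)       # (cell rows, cell height, cell cols, cell width)
features = cells.mean(axis=(1, 3))     # 6x5 grid of densities
print(features.shape)  # (6, 5)

# A model letter would be another 6x5 density grid; the best match
# minimizes the (optionally weighted) squared difference
model = features.copy()                # placeholder model, not a real template
score = ((features - model) ** 2).sum()
print(score)  # 0.0 for a perfect match
```

Real model grids would be built the same way from reference images of A-Z, a-z, and 0-9; the reshape trick only works when the crop is first resized to a multiple of the grid size.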