Python captcha decoder library
Question
I need a captcha decoder for Python to read simple image captchas like the one in the following picture:
Do you know of a library that can help me read this captcha?
If you don't know of a library for reading captchas, could you help me read this one (and others like it) with PIL?
Recommended answer
I hope this captcha is not used anywhere.
The following is a naive way to decode it. Basically, what you need are the patterns for the digits 0 to 9 as they appear in these captchas. From your examples, I only have the patterns for 0, 3, 4, 5, 7 and 8. Since everything in these captchas is fixed, you know where to split each character. You also know that each character is a digit of fixed size and fixed font. If the captcha also includes letters or more characters, but still of fixed size and font, the following code can be easily adapted.
What the code does is: a) load the patterns (I assumed they are named n0.png, n1.png, ...); b) split the captcha into NUMS pieces; c) compute the sum of squared differences between each pattern and each split piece; d) decide that each split piece is the digit whose pattern gives the smallest sum. It returns a list of the digits present in the captcha, in order. To obtain the initial patterns, you can uncomment the lines that save the split numbers, place a return after that piece, and adjust the file names.
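The bootstrapping step mentioned in the last sentence could be sketched on its own roughly as follows. This is only a sketch: it assumes the same layout constants as the listing that follows (8x10 glyphs, first digit at x = 5, a 1-pixel gap, digits starting 3 pixels from the top), and the names digit_boxes, split_digits and save_digits are illustrative, not part of the original answer.

```python
from PIL import Image

PAT_SIZE = (8, 10)        # glyph width and height in pixels
NUMS = 3                  # digits per captcha
FIRST_NUM_OFFSET = 5      # x coordinate of the first digit
NUM_OFFSET = (1, 3)       # (gap between digits, y of the digits' top edge)


def digit_boxes():
    """Return the (left, upper, right, lower) crop box of each digit."""
    boxes = []
    for n in range(NUMS):
        x1 = FIRST_NUM_OFFSET + n * (NUM_OFFSET[0] + PAT_SIZE[0])
        y1 = NUM_OFFSET[1]
        boxes.append((x1, y1, x1 + PAT_SIZE[0], y1 + PAT_SIZE[1]))
    return boxes


def split_digits(captcha):
    """Crop a captcha image into one small image per digit."""
    return [captcha.crop(box) for box in digit_boxes()]


def save_digits(fname, prefix="piece"):
    """Save each digit of a captcha file as prefix0.png, prefix1.png, ..."""
    captcha = Image.open(fname).convert("L")  # grayscale, like the main code
    for i, piece in enumerate(split_digits(captcha)):
        piece.save("%s%d.png" % (prefix, i))
```

After running save_digits on a few captchas, each saved piece would be renamed by hand to n<digit>.png according to the digit it shows, which is the file naming the main code expects.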
import sys
from PIL import Image, ImageOps

PAT_SIZE = (8, 10)       # width and height of one digit pattern, in pixels
NUMS = 3                 # number of digits in the captcha
FIRST_NUM_OFFSET = 5     # x coordinate where the first digit starts
NUM_OFFSET = (1, 3)      # (horizontal gap between digits, y of the digits' top edge)

# Load the known digit patterns; a missing pattern is stored as None.
NUMBERS = []
for i in range(10):
    try:
        NUMBERS.append(Image.open('n%d.png' % i).load())
    except IOError:
        print("I do not know the pattern for the number %d." % i)
        NUMBERS.append(None)


def magic(fname):
    captcha = ImageOps.grayscale(Image.open(fname))

    # Split numbers
    num = []
    for n in range(NUMS):
        x1, y1 = (FIRST_NUM_OFFSET + n * (NUM_OFFSET[0] + PAT_SIZE[0]),
                  NUM_OFFSET[1])
        num.append(captcha.crop((x1, y1, x1 + PAT_SIZE[0], y1 + PAT_SIZE[1])))

    # If you want to save the split numbers:
    #for i, n in enumerate(num):
    #    n.save('%d.png' % i)

    def sqdiff(a, b):
        if a is None or b is None:  # XXX This is here just to handle missing pattern.
            return float('inf')
        d = 0
        for x in range(PAT_SIZE[0]):
            for y in range(PAT_SIZE[1]):
                d += (a[x, y] - b[x, y]) ** 2
        return d

    # Calculate a dummy sum of squared differences between the patterns
    # and each number. We assume the smallest diff is the number in the
    # "captcha".
    result = []
    for n in num:
        n_sqdiff = [(sqdiff(p, n.load()), i) for i, p in enumerate(NUMBERS)]
        result.append(min(n_sqdiff)[1])
    return result


print(magic(sys.argv[1]))
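To try the matching step (c and d above) without any image files, here is a small self-contained sketch. The two patterns are made-up shapes rather than real captcha digits, and make_glyph and classify are hypothetical helper names; the point is only to show that the pattern with the smallest sum of squared differences wins even when the piece is slightly noisy.

```python
from PIL import Image

SIZE = (8, 10)  # same glyph size as PAT_SIZE in the answer


def make_glyph(dark_pixels):
    """Build a white 8x10 grayscale image with the given pixels set to black."""
    img = Image.new("L", SIZE, 255)
    for xy in dark_pixels:
        img.putpixel(xy, 0)
    return img


def sqdiff(a, b):
    """Sum of squared pixel differences between two same-sized 'L' images."""
    # For mode "L", tobytes() yields one byte (0..255) per pixel.
    return sum((p - q) ** 2 for p, q in zip(a.tobytes(), b.tobytes()))


def classify(piece, patterns):
    """Return the label of the pattern with the smallest squared difference."""
    return min(patterns, key=lambda label: sqdiff(piece, patterns[label]))


# Two made-up patterns: a vertical bar labelled 1, a horizontal bar labelled 7.
patterns = {
    1: make_glyph([(4, y) for y in range(10)]),
    7: make_glyph([(x, 0) for x in range(8)]),
}

# A noisy copy of the vertical bar: one stroke pixel missing, one stray pixel.
noisy = patterns[1].copy()
noisy.putpixel((4, 9), 255)
noisy.putpixel((0, 5), 0)

print(classify(noisy, patterns))  # prints 1
```

This only works because the glyphs are aligned pixel for pixel; with shifted, rotated or distorted characters a plain squared difference would no longer be a reliable score.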