How to extract numbers from a complex captcha

Views: 94 | Published: 2021/4/21 19:24:33 | Tags: python tesseract captcha python-tesseract

Problem description

I am trying to solve the captcha in the following image:

Image: https://ibb.co/35X723J

I have tried using Tesseract (through pytesseract):

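import pytesseract
from PIL import Image

# br, captchaurl and filename are defined elsewhere in the asker's script
data = br.open(captchaurl).read()
# Save the downloaded bytes to disk so PIL can open them
with open(filename, 'wb') as save:
    save.write(data)
# Run OCR on the saved image
ctext = pytesseract.image_to_string(Image.open(filename))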

Solution

Option 1:

I think pytesseract should be enough to solve this. I tried your code, and it gave me the following result when I passed the exact cropped captcha image to pytesseract:

Input Image:

Output:

print(ctext)
 '436359 oS'

I suggest that you don't pass the full page URL to pytesseract. Instead, use the direct image URL, "https://i.ibb.co/RGn9fF5/Jpeg-Image-CS2.jpg", so that only the image itself is downloaded.
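One way to catch a page URL passed by mistake is to check that the downloaded bytes really are an image before handing them to pytesseract. The following is only a sketch and is not part of the original answer: the helper name looks_like_image is hypothetical, and it recognizes only JPEG, PNG and GIF file signatures.

```python
# Hypothetical helper (not from the original answer): detect common image
# formats by their leading "magic bytes".
IMAGE_SIGNATURES = (
    b"\xff\xd8\xff",        # JPEG
    b"\x89PNG\r\n\x1a\n",   # PNG
    b"GIF87a",              # GIF (1987 variant)
    b"GIF89a",              # GIF (1989 variant)
)


def looks_like_image(data: bytes) -> bool:
    """Return True if data starts with a known image file signature."""
    return data.startswith(IMAGE_SIGNATURES)
```

Calling looks_like_image(data) on the bytes read from captchaurl returns False for an HTML page, which suggests that the URL points at the hosting page rather than at the image file.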

As for the extra 'oS' characters in the output, you can use a string operation to strip everything except the digits:

import re
digits = re.sub("[^0-9]", "", ctext)
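The same cleanup can be wrapped in a small function, which also makes it possible to reject OCR results with an unexpected number of digits. This is a minimal sketch: the function name extract_digits and the expected_length parameter are illustrative assumptions, and the six-digit length is inferred only from the sample output above.

```python
import re


def extract_digits(ocr_text, expected_length=None):
    """Strip every non-digit character from OCR output.

    expected_length is an assumed, optional sanity check: if it is given and
    the number of digits differs, None is returned instead of a bad guess.
    """
    digits = re.sub("[^0-9]", "", ocr_text)
    if expected_length is not None and len(digits) != expected_length:
        return None
    return digits


print(extract_digits("436359 oS", expected_length=6))  # prints 436359
```

Returning None when the length is wrong lets the caller request a new captcha instead of submitting a value that is probably incorrect.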

Option 2:

You can also use Google's OCR for this, which gave the exact result without errors. I have only shown its web interface here, but Google also provides Python libraries that let you do the same thing from Python. It looks like this:
