Removing background noisy lines from a Captcha image using Python PIL

Views: 861 Published: 2020/5/27 20:55:58 Tags: python algorithm image-processing python-imaging-library captcha

Problem description

I have a processed captcha image (enlarged) that looks like this:

As you can see, the font size of the "TEXT" is a bit larger than the width of the noisy lines.
So I need an algorithm or code to remove the noisy lines from this image.

With the help of the Python PIL library and the chopping algorithm shown below, I didn't get an output image that OCR engines could read easily.

Here's the Python code that I tried:

import PIL.Image
import sys

# python chop.py [chop-factor] [in-file] [out-file]

chop = int(sys.argv[1])
image = PIL.Image.open(sys.argv[2]).convert('1')
width, height = image.size
data = image.load()

# Iterate through the rows.
for y in range(height):
    x = 0
    while x < width:

        # Make sure we're on a dark pixel.
        if data[x, y] >= 128:
            x += 1
            continue

        # Keep a total of non-white contiguous pixels.
        total = 0

        # Check a sequence ranging from x to image.width.
        for c in range(x, width):

            # If the pixel is dark, add it to the total.
            if data[c, y] < 128:
                total += 1

            # If the pixel is light, stop the sequence.
            else:
                break

        # If the total is at most the chop, replace everything with white.
        if total <= chop:
            for c in range(total):
                data[x + c, y] = 255

        # Skip the sequence we just measured. This needs a while loop:
        # reassigning the variable of a for loop has no effect in Python,
        # so the tail of every long run would be re-measured and chopped.
        x += total


# Iterate through the columns.
for x in range(width):
    y = 0
    while y < height:

        # Make sure we're on a dark pixel.
        if data[x, y] >= 128:
            y += 1
            continue

        # Keep a total of non-white contiguous pixels.
        total = 0

        # Check a sequence ranging from y to image.height.
        for c in range(y, height):
            # If the pixel is dark, add it to the total.
            if data[x, c] < 128:
                total += 1

            # If the pixel is light, stop the sequence.
            else:
                break

        # If the total is at most the chop, replace everything with white.
        if total <= chop:
            for c in range(total):
                data[x, y + c] = 255

        # Skip the sequence we just measured (see the note in the row pass).
        y += total

image.save(sys.argv[3])
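
The run-length idea behind the script can be checked on a single row of pixel values without PIL. This is only a minimal sketch: `chop_line` and its `dark_below` and `white` parameters are hypothetical names chosen for illustration, not part of the script above.

```python
def chop_line(pixels, chop, dark_below=128, white=255):
    """Return a copy of `pixels` with dark runs of length <= chop whitened."""
    out = list(pixels)
    i = 0
    while i < len(out):
        # Light pixel: nothing to measure, move on.
        if out[i] >= dark_below:
            i += 1
            continue
        # Measure the run of dark pixels that starts at i.
        end = i
        while end < len(out) and out[end] < dark_below:
            end += 1
        # Short runs are treated as noise and replaced with white.
        if end - i <= chop:
            out[i:end] = [white] * (end - i)
        # Jump past the whole run so its tail is never measured again.
        i = end
    return out


row = [255, 0, 0, 255, 0, 0, 0, 0, 0, 255, 0, 255]
print(chop_line(row, chop=2))
# prints [255, 255, 255, 255, 0, 0, 0, 0, 0, 255, 255, 255]
```

With `chop=2`, the runs of length 2 and 1 are whitened while the run of length 5 survives. The same limitation shows up here as in the full script: a noisy line that runs along the scan direction forms a long run and is kept, so it can only be caught by the pass in the other direction.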

So, basically, I would like to know a better algorithm or code to get rid of the noise and thus make the image readable by an OCR engine (Tesseract or pytesser).
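
Because the text strokes are a bit wider than the noisy lines, one direction that may be worth trying is to erase every dark feature thinner than a small window and then grow the surviving strokes back. The sketch below does this with PIL's `ImageFilter.MaxFilter` and `ImageFilter.MinFilter`. The function name `remove_thin_dark_lines`, the default window `size`, and the file names in the usage comment are assumptions for illustration; a suitable window size depends on the actual image, and thin parts of the letters may be lost along with the lines.

```python
from PIL import Image, ImageFilter


def remove_thin_dark_lines(image, size=3):
    """Whiten dark features thinner than `size` pixels, then regrow the rest.

    Assumes dark text and lines on a light background; `size` must be odd.
    """
    gray = image.convert("L")
    # MaxFilter replaces each pixel with the brightest value in a
    # size x size window, so dark features narrower than the window vanish.
    eroded = gray.filter(ImageFilter.MaxFilter(size))
    # MinFilter then grows the surviving dark strokes back toward their
    # original thickness.
    return eroded.filter(ImageFilter.MinFilter(size))


# Hypothetical usage (file names are placeholders):
# cleaned = remove_thin_dark_lines(Image.open("captcha.png"), size=3)
# cleaned.save("captcha_cleaned.png")
```

Unlike the row and column chopping, this treats all directions at once, so a thin line is removed whether it is horizontal, vertical, or diagonal.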
