Python: Detecting textblock and deleting it from image (OpenCV)

Problem description

I'm currently trying to figure out how to detect a text paragraph in an image so that I can remove it.

I get an input image similar to the one given above. From it, I want to detect the body of the comment, i.e. the message itself. The likes, username and avatar are not needed and should be ignored. The body should then be removed from the image, while the rest stays.

So far I have applied a threshold and found the contours. The problem is that the comment body is not detected as a single region but as many separate contours. How do I combine them? Furthermore, I want to remove the body from the image once I have found its contour. The background color is RGB(17, 17, 17); is there a way of painting over it, or how does that work in OpenCV? I'm quite new to it.

import cv2

img = cv2.imread("Comment.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Fixed threshold: pixels brighter than 80 become white (255)
_, threshold = cv2.threshold(gray, 80, 255, cv2.THRESH_BINARY)
contours, _ = cv2.findContours(threshold, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

The result should look like this:

Help is appreciated, thanks in advance!

Recommended answer

The idea is really simple. Use morphology to isolate the text you want to detect. From that image, create a mask that deletes the region of interest in the input image and produces the final image. All via morphology. My answer is in C++, but the implementation is really easy:

#include <opencv2/opencv.hpp>
#include <string>

//Read input image:
std::string imagePath = "C://opencvImages//commentImage.png";
cv::Mat imageInput = cv::imread( imagePath );

//Convert it to grayscale:
cv::Mat grayImg;
cv::cvtColor( imageInput, grayImg, cv::COLOR_BGR2GRAY );

//Get binary image via Otsu (the threshold argument 0 is ignored; Otsu picks the value):
cv::threshold( grayImg, grayImg, 0, 255, cv::THRESH_BINARY | cv::THRESH_OTSU );

Up to this point, you have generated the binary image. Now, let's dilate the image using a rectangular structuring element (SE) that is wider than it is tall. The idea is to join all the text horizontally AND vertically (just a little bit). If you look at the input image, the "TEST132212" text is separated from the comment by just enough space to survive the dilate operation, it seems. Here I'm using an SE of size 9 x 6 (width x height) with 2 iterations:

cv::Mat morphKernel = cv::getStructuringElement( cv::MORPH_RECT, cv::Size(9, 6) );
int morphIterations = 2;
cv::morphologyEx( grayImg, grayImg, cv::MORPH_DILATE, morphKernel, cv::Point(-1,-1), morphIterations );

This is the result:

I got a single block where the original comment was. Nice! Now, this is the largest blob in the image. If I subtract it from the dilated binary image, I get a mask that isolates everything that is not the "comment" blob:

cv::Mat bigBlob = findBiggestBlob( grayImg );

I get this:

Now, the binary mask generation:

cv::Mat binaryMask = grayImg - bigBlob;

//Use the binaryMask to produce the final image:
cv::Mat resultImg;
imageInput.copyTo( resultImg, binaryMask );

This produces the masked image:

Now, you should have noticed the findBiggestBlob function. This is a function I wrote that returns the biggest blob in a binary image. The idea is to compute all the contours in the input image, calculate their areas, and keep the contour with the largest area of the bunch. This is the C++ implementation:

//Function to get the largest blob in a binary image:
cv::Mat findBiggestBlob( cv::Mat &inputImage ){

    cv::Mat biggestBlob = inputImage.clone();

    double largest_area = 0;   // contourArea returns a double
    int largest_contour_index=0;

    std::vector< std::vector<cv::Point> > contours; // Vector for storing contour
    std::vector<cv::Vec4i> hierarchy;

    // Find the contours in the image
    cv::findContours( biggestBlob, contours, hierarchy, cv::RETR_CCOMP, cv::CHAIN_APPROX_SIMPLE );

    for( int i = 0; i< (int)contours.size(); i++ ) {            

        //Find the area of the contour            
        double a = cv::contourArea( contours[i],false);
        //Store the index of largest contour:
        if( a > largest_area ){
            largest_area = a;                
            largest_contour_index = i;
        }

    }

    //Once you get the biggest blob, paint it black:
    cv::Mat tempMat = biggestBlob.clone();
    cv::drawContours( tempMat, contours, largest_contour_index, cv::Scalar(0),
                  cv::FILLED, 8, hierarchy );

    //Erase the smaller blobs:
    biggestBlob = biggestBlob - tempMat;
    tempMat.release();
    return biggestBlob;
}

Edit: Since posting this answer, I've been learning Python. Here's the Python equivalent of the C++ code:

import cv2
import numpy as np

# Set image path
path = "D://opencvImages//"
fileName = "commentImage.png"

# Read Input image
inputImage = cv2.imread(path+fileName)

# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)

# Threshold via Otsu:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)

# Set kernel (structuring element) size:
kernelSize = (9, 6)

# Set operation iterations:
opIterations = 2

# Get the structuring element:
morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, kernelSize)

# Perform Dilate:
openingImage = cv2.morphologyEx(binaryImage, cv2.MORPH_DILATE, morphKernel, iterations=opIterations, borderType=cv2.BORDER_REFLECT101)

# Find the big contours/blobs on the filtered image:
biggestBlob = openingImage.copy()
contours, hierarchy = cv2.findContours(biggestBlob, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)

largestArea = 0
largestContourIndex = 0

# Loop through the contours, store the biggest one:
for i, c in enumerate(contours):

    # Get the area for the current contour:
    currentArea = cv2.contourArea(c, False)

    # Store the index of largest contour:
    if currentArea > largestArea:
        largestArea = currentArea
        largestContourIndex = i

# Once you get the biggest blob, paint it black:
tempMat = biggestBlob.copy()
# Draw the contours on the mask image:
cv2.drawContours(tempMat, contours, largestContourIndex, (0, 0, 0), -1, 8, hierarchy)

# Erase the smaller blobs:
biggestBlob = biggestBlob - tempMat

# Generate the binary mask:
binaryMask = openingImage - biggestBlob

# Use the binaryMask to produce the final image:
resultImg = cv2.bitwise_and(inputImage, inputImage, mask = binaryMask)

cv2.imshow("Result", resultImg)
cv2.waitKey(0)
