Panorama stitching for text

Question

I'm looking for a good panorama stitching library for text. I tried OpenCV and OpenPano. They both work well on regular photos but fail on text. For example, I need to stitch the following 3 images:

The images have about 45% overlap with each other.

If there's a way to make one of the mentioned libraries work well on text images, instead of finding another library, that would be great.

  • I need the library to work on Linux ARM.

Answer

OpenPano fails at stitching text because it cannot retrieve enough feature points (or keypoints) to do the stitching process.

Text stitching doesn't need a matching method that is robust to rotations, only to translations. OpenCV conveniently offers such a function. It is called: Template Matching.
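To make the idea concrete, here is a minimal pure-NumPy sketch of translation-only matching: a brute-force version of what cv2.matchTemplate computes with the TM_CCOEFF method (the toy image and offsets below are made up for the example):

```python
import numpy as np

def match_translation(img, templ):
    """Brute force translation-only matching: slide templ over img and
    return the (x, y) offset with the highest correlation score."""
    ih, iw = img.shape
    th, tw = templ.shape
    t = templ - templ.mean()  # zero-mean template, as TM_CCOEFF does
    best_score, best_pos = -np.inf, (0, 0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            window = img[y:y+th, x:x+tw]
            score = np.sum((window - window.mean()) * t)
            if score > best_score:
                best_score, best_pos = score, (x, y)
    return best_pos

# toy example: plant a distinctive 3x3 patch at a known offset
img = np.zeros((10, 10))
patch = np.arange(9, dtype=float).reshape(3, 3)
img[4:7, 5:8] = patch
print(match_translation(img, patch))  # -> (5, 4)
```

cv2.matchTemplate does the same sliding comparison, but vectorized and with several scoring methods to choose from.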

The solution I will develop is based on this OpenCV feature.

I will now explain the main steps of my solution (for further details, please have a look at the code provided below).

In order to match two consecutive images (done in the matchImages function, see code below):

  1. We create a template image by taking 45% (H_templ_ratio) of the first image, as depicted below:

This step is done in my code by the function genTemplate.

  2. We add black margins to the second image (where we want to find the template). This step is necessary if the text is not aligned in the input images (though this is the case for these sample images). Here is what the image looks like after the margin process. As you can see, the margins are only needed below and above the image:

The template image could theoretically be found anywhere in this margined image. This process is done in the addBlackMargins function.
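Taken in isolation, the margin step is plain zero-padding into a larger black canvas. Here is a minimal NumPy sketch of what addBlackMargins does, applied to a made-up tiny white image:

```python
import numpy as np

def add_black_margins(img, top, bottom, left, right):
    """Pad a color image with black (zero-valued) margins on each side."""
    h, w = img.shape[:2]
    out = np.zeros((h + top + bottom, w + left + right, 3), np.uint8)
    out[top:top + h, left:left + w] = img  # paste original into the center
    return out

img = np.full((2, 3, 3), 255, np.uint8)      # tiny all-white 2x3 image
padded = add_black_margins(img, 2, 2, 0, 0)  # 2 px of black above and below
print(padded.shape)  # -> (6, 3, 3)
```

With OpenCV available, cv2.copyMakeBorder with a constant black border achieves the same result.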

  3. We apply a Canny filter on both the template image and the image where we want to find it (done inside the mat2Edges function). This adds robustness to the matching process. Here is an example:

  4. We find the best match position using the minMaxLoc function.

Calculating final image size

This step consists of calculating the size of the final matrix in which we will stitch all the images together. It is particularly needed if the input images don't all have the same height.

This step is done inside the calcFinalImgSize function. I won't go into too much detail here because even though it looks a bit complex (to me at least), it is only simple maths (additions, subtractions, multiplications). Take a pen and paper if you want to understand the formulas.
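For intuition, here is the width part of that calculation worked through on made-up numbers (two hypothetical images and a hypothetical match location):

```python
# Two hypothetical images, both 100 px wide, with H_templ_ratio = 0.45:
# the template is the rightmost 45 px of image 1, and suppose
# matchTemplate locates that template at x = 10 in image 2.
w1, w2 = 100, 100
H_templ_ratio = 0.45
templ_width = int(w1 * H_templ_ratio)  # 45 px of image 1 used as template
match_x = 10                           # hypothetical match location in image 2

# image 2 adds its own width minus the part that overlaps image 1
w_final = w1 + (w2 - templ_width - match_x)
print(w_final)  # -> 145
```

The height calculation works the same way, accumulating the vertical offsets of the matches to find the top and bottom margins needed around the first image.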

Once we have the match locations for each input image, we only have to do simple maths to copy the input images into the right spot of the final image. Again, I recommend checking the code for implementation details (see the stitchImages function).

Here is the result with the input images:

As you can see, the result is not "pixel perfect" but it should be good enough for OCR.

And here is another result with input images of different heights:

My program is written in Python and uses the cv2 (OpenCV) and numpy modules. However, it can easily be ported to other languages such as C++, Java, and C#.

import numpy as np
import cv2

def genTemplate(img): 
    global H_templ_ratio
    # we get the image's width and height
    h, w = img.shape[:2]
    # we compute the template's bounds
    x1 = int(float(w)*(1-H_templ_ratio))
    y1 = 0
    x2 = w
    y2 = h
    return(img[y1:y2,x1:x2]) # and crop the input image

def mat2Edges(img): # applies a Canny filter to get the edges
    edged = cv2.Canny(img, 100, 200)
    return(edged)

def addBlackMargins(img, top, bottom, left, right): # top, bottom, left, right: margins width in pixels
    h, w = img.shape[:2]
    result = np.zeros((h+top+bottom, w+left+right, 3), np.uint8)
    result[top:top+h,left:left+w] = img
    return(result)

# return the y_offset of the first image to stitch and the final image size needed
def calcFinalImgSize(imgs, loc):
    global V_templ_ratio, H_templ_ratio
    max_margin_top = 0; max_margin_bottom = 0 # maximum margins needed above and below the first image in order to stitch all the images into one mat
    current_margin_top = 0; current_margin_bottom = 0

    h_init, w_init = imgs[0].shape[:2]
    w_final = w_init

    for i in range(0,len(loc)):
        h, w = imgs[i].shape[:2]
        h2, w2 = imgs[i+1].shape[:2]
        # we compute the max top/bottom margins that will be needed (relatively to the first input image) in order to stitch all the images
        current_margin_top += loc[i][1] # here, we assume that the template top-left corner Y-coordinate is 0 (relatively to its original image)
        current_margin_bottom += (h2 - loc[i][1]) - h
        if(current_margin_top > max_margin_top): max_margin_top = current_margin_top
        if(current_margin_bottom > max_margin_bottom): max_margin_bottom = current_margin_bottom
        # we compute the width needed for the final result
        templ_width = int(float(w)*H_templ_ratio) # width of the template extracted from imgs[i]
        w_final += (w2 - templ_width - loc[i][0]) # extra width contributed by imgs[i+1] once the overlap is removed

    h_final = h_init + max_margin_top + max_margin_bottom
    return (max_margin_top, h_final, w_final)

# match each input image with its following image (1->2, 2->3) 
def matchImages(imgs, templates_loc):
    for i in range(0,len(imgs)-1):
        template = genTemplate(imgs[i])
        template = mat2Edges(template)
        h_templ, w_templ = template.shape[:2]
        # Apply template Matching
        margin_top = margin_bottom = h_templ; margin_left = margin_right = 0
        img = addBlackMargins(imgs[i+1],margin_top, margin_bottom, margin_left, margin_right) # we need to enlarge the input image prior to call matchTemplate (template needs to be strictly smaller than the input image)
        img = mat2Edges(img)
        res = cv2.matchTemplate(img,template,cv2.TM_CCOEFF) # matching function
        _, _, _, templ_pos = cv2.minMaxLoc(res) # minMaxLoc gets the best match position
        # as we added margins to the input image we need to subtract the margins width to get the template position relatively to the initial input image (without the black margins)
        rectified_templ_pos = (templ_pos[0]-margin_left, templ_pos[1]-margin_top) 
        templates_loc.append(rectified_templ_pos)
        print("max_loc", rectified_templ_pos)

def stitchImages(imgs, templates_loc):
    y_offset, h_final, w_final = calcFinalImgSize(imgs, templates_loc) # we calculate the "surface" needed to stitch all the images into one mat (and y_offset, the Y offset of the first image to be stitched) 
    result = np.zeros((h_final, w_final, 3), np.uint8)

    #initial stitch
    h_init, w_init = imgs[0].shape[:2]
    result[y_offset:y_offset+h_init, 0:w_init] = imgs[0]
    origin = (y_offset, 0) # top-left corner of the last stitched image (y,x)
    # stitching loop
    for j in range(0,len(templates_loc)):
        h, w = imgs[j].shape[:2]
        h2, w2 = imgs[j+1].shape[:2]
        # we compute the coordinates where to stitch imgs[j+1]
        y1 = origin[0] - templates_loc[j][1]
        y2 = origin[0] - templates_loc[j][1] + h2
        x_templ = int(float(w)*(1-H_templ_ratio)) # x-coordinate of the template's left edge within its original image
        x1 = origin[1] + x_templ - templates_loc[j][0]
        x2 = origin[1] + x_templ - templates_loc[j][0] + w2
        result[y1:y2, x1:x2] = imgs[j+1] # we copy the input image into the result mat
        origin = (y1,x1) # we update the origin point with the last stitched image

    return(result)

if __name__ == '__main__':

    # input images
    part1 = cv2.imread('part1.jpg')
    part2 = cv2.imread('part2.jpg')
    part3 = cv2.imread('part3.jpg')
    imgs = [part1, part2, part3]

    H_templ_ratio = 0.45 # H_templ_ratio: horizontal ratio of the input that we will keep to create a template
    templates_loc = [] # templates location

    matchImages(imgs, templates_loc)

    result = stitchImages(imgs, templates_loc)

    cv2.imshow("result", result)
    cv2.waitKey(0)
