使用"VNImageHomographicAlignmentObservation"合并图像.班级 [英] Merge images using "VNImageHomographicAlignmentObservation" class


Question

I am trying to merge two images using VNImageHomographicAlignmentObservation. I am currently getting a 3x3 matrix that looks like this:

simd_float3x3([[0.99229,   -0.00451023, -4.32607e-07],
               [0.00431724, 0.993118,    2.38839e-07],
               [-72.2425,  -67.9966,     0.999288]])

But I don't know how to use these values to merge the images into one. There doesn't seem to be any documentation on what these values even mean. I found some information on transformation matrices here: Working with matrices.

But so far nothing else has helped me... Any suggestions?

My code:

func setup() {
    let floatingImage = UIImage(named: "DJI_0333")!
    let referenceImage = UIImage(named: "DJI_0327")!

    let request = VNHomographicImageRegistrationRequest(targetedCGImage: floatingImage.cgImage!, options: [:])

    let handler = VNSequenceRequestHandler()
    try! handler.perform([request], on: referenceImage.cgImage!)

    if let results = request.results as? [VNImageHomographicAlignmentObservation] {
        print("Perspective warp found: \(results.count)")
        results.forEach { observation in
            // A matrix with 3 rows and 3 columns.
            let matrix = observation.warpTransform
            print(matrix)
        }
    }
}

Answer

This homography matrix H describes how to project one of your images onto the image plane of the other image. To transform each pixel to its projected location, you can compute its projected location x' = H * x using homogeneous coordinates (basically take your 2D image coordinate, add a 1.0 as the third component, apply the matrix H, and go back to 2D by dividing through the 3rd component of the result).
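For a single point, that projection looks like this in Swift (a minimal sketch; observation stands for one of the VNImageHomographicAlignmentObservation results from the question's code):

import simd
import CoreGraphics

let homography: simd_float3x3 = observation.warpTransform // assumed to be in scope
let x = SIMD3<Float>(100.0, 50.0, 1.0)   // the 2D point (100, 50) with a 1.0 appended
let xp = homography * x                  // x' = H * x
let projected = CGPoint(x: CGFloat(xp.x / xp.z),  // back to 2D by dividing
                        y: CGFloat(xp.y / xp.z))  // through the 3rd component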

The most efficient way to do this for every pixel is to write this matrix multiplication in homogeneous space using CoreImage. CoreImage offers multiple shader kernel types: CIColorKernel, CIWarpKernel, and CIKernel. For this task, we only want to transform the location of each pixel, so a CIWarpKernel is what you need. Using the Core Image Shading Language, that would look as follows:

import CoreImage

// CIWarpKernel(source:) is failable, hence the force-unwrap.
let warpKernel = CIWarpKernel(source:
    """
    kernel vec2 warp(mat3 homography)
    {
        vec3 homogen_in = vec3(destCoord().x, destCoord().y, 1.0); // create homogeneous coord
        vec3 homogen_out = homography * homogen_in; // transform by homography
        return homogen_out.xy / homogen_out.z; // back to normal 2D coordinate
    }
    """
)!

Note that the shader wants a mat3 called homography, which is the shading-language equivalent of the simd_float3x3 matrix H. When calling the shader, the matrix is expected to be stored in a CIVector. To convert it, use:

let (col0, col1, col2) = yourHomography.columns
let homographyCIVector = CIVector(values: [CGFloat(col0.x), CGFloat(col0.y), CGFloat(col0.z),
                                           CGFloat(col1.x), CGFloat(col1.y), CGFloat(col1.z),
                                           CGFloat(col2.x), CGFloat(col2.y), CGFloat(col2.z)], count: 9)
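If you need this conversion in more than one place, it can be wrapped in a small helper. This is just a sketch; the function name ciVector(from:) is mine, not part of the original answer:

import CoreImage
import simd

// Flatten a simd_float3x3 (stored column-major) into the 9-element
// layout that the mat3 kernel parameter expects.
func ciVector(from matrix: simd_float3x3) -> CIVector {
    let (c0, c1, c2) = matrix.columns
    let values = [c0, c1, c2].flatMap { [CGFloat($0.x), CGFloat($0.y), CGFloat($0.z)] }
    return CIVector(values: values, count: 9)
}

// Usage: let homographyCIVector = ciVector(from: yourHomography)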

CIWarpKernel应用于图像时,必须告诉 CoreImage 输出应该多大.要合并变形图像和参考图像,输出应足够大以覆盖整个投影的原始图像.我们可以通过将单应性应用于图像矩形的每个角来计算投影图像的大小(这次在Swift中,CoreImage将此矩形称为 extent ):

When you apply the CIWarpKernel to an image, you have to tell CoreImage how big the output should be. To merge the warped and reference image, the output should be big enough to cover the whole projected and original image. We can compute the size of the projected image by applying the homography to each corner of the image rect (this time in Swift, CoreImage calls this rect the extent):

import simd

/**
 * Convert a 2D point to a homogeneous coordinate, transform by the provided homography,
 * and convert back to a non-homogeneous 2D point.
 */
func transform(_ point: CGPoint, by homography: matrix_float3x3) -> CGPoint
{
  let inputPoint = SIMD3<Float>(Float(point.x), Float(point.y), 1.0)
  var outputPoint = homography * inputPoint
  outputPoint /= outputPoint.z // perspective divide back to 2D
  return CGPoint(x: CGFloat(outputPoint.x), y: CGFloat(outputPoint.y))
}

func computeExtentAfterTransforming(_ extent: CGRect, with homography: matrix_float3x3) -> CGRect
{
  // Transform all four corners of the rect and take their bounding box.
  let points = [transform(extent.origin, by: homography),
                transform(CGPoint(x: extent.origin.x + extent.width, y: extent.origin.y), by: homography),
                transform(CGPoint(x: extent.origin.x + extent.width, y: extent.origin.y + extent.height), by: homography),
                transform(CGPoint(x: extent.origin.x, y: extent.origin.y + extent.height), by: homography)]

  var (xmin, xmax, ymin, ymax) = (points[0].x, points[0].x, points[0].y, points[0].y)
  points.forEach { p in
    xmin = min(xmin, p.x)
    xmax = max(xmax, p.x)
    ymin = min(ymin, p.y)
    ymax = max(ymax, p.y)
  }
  return CGRect(x: xmin, y: ymin, width: xmax - xmin, height: ymax - ymin)
}
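A quick way to sanity-check these helpers (not part of the original answer): transforming an extent with the identity homography must return the same rect:

// With the identity matrix the rect should come back unchanged.
let rect = CGRect(x: 0, y: 0, width: 640, height: 480)
assert(computeExtentAfterTransforming(rect, with: matrix_identity_float3x3) == rect)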

let ciFloatingImage = CIImage(image: floatingImage)! // CIImage(image:) is failable
let warpedExtent = computeExtentAfterTransforming(ciFloatingImage.extent, with: homography.inverse)
let outputExtent = warpedExtent.union(ciFloatingImage.extent)

Now you can create a warped version of your floating image:

let ciWarpedImage = warpKernel.apply(extent: outputExtent,
    roiCallback: { (index, rect) in
        // The kernel samples the input at (homography * outputCoord), so the
        // input region needed for an output rect is that rect mapped forward
        // through the homography (not its inverse).
        return computeExtentAfterTransforming(rect, with: homography)
    },
    image: ciFloatingImage,
    arguments: [homographyCIVector])!

The roiCallback is there to tell CoreImage which part of the input image is needed to compute a certain part of the output. CoreImage uses this to apply the shader to the image block by block, so that it can process huge images. (See Creating Custom Filters in Apple's docs.) A quick hack would be to always return CGRect.infinite here, but then CoreImage can't do any of its block-wise magic.
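For reference, the quick-hack variant mentioned above would look like this; it is fine for experimenting, but it prevents CoreImage from processing the image in tiles:

// Hack: claim the whole input is always needed, disabling tiled processing.
let ciWarpedImageHack = warpKernel.apply(extent: outputExtent,
    roiCallback: { _, _ in CGRect.infinite },
    image: ciFloatingImage,
    arguments: [homographyCIVector])!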

And lastly, create a composite image of the reference image and the warped image:

let ciReferenceImage = CIImage(image: referenceImage)! // CIImage(image:) is failable
let ciResultImage = ciWarpedImage.composited(over: ciReferenceImage)
let resultImage = UIImage(ciImage: ciResultImage)
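One caveat worth knowing: UIImage(ciImage:) wraps the CIImage without rendering it, and some UIKit APIs handle such images poorly. If that bites you, rendering explicitly through a CIContext is a common workaround (a sketch):

// Render the CIImage into a bitmap-backed CGImage via a CIContext.
let context = CIContext()
if let cgResult = context.createCGImage(ciResultImage, from: ciResultImage.extent) {
    let renderedImage = UIImage(cgImage: cgResult)
    // renderedImage can now be used like any bitmap-backed UIImage
}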

