Image registration using python and cross-correlation


Problem description

I have two images showing exactly the same content: 2D Gaussian-shaped spots. I call these two 16-bit png files "left.png" and "right.png". But as they were obtained through slightly different optical setups, the corresponding spots (physically the same) appear at slightly different positions: the right image is slightly stretched or distorted in a non-linear way. Therefore I would like to get the transformation from left to right.

So for every pixel on the left side, with its x- and y-coordinate, I want a function giving me the components of the displacement vector that points to the corresponding pixel on the right side.

In a former approach I tried to find the positions of the corresponding spots to obtain the relative distances deltaX and deltaY. I then fitted these distances to the Taylor expansion, up to second order, of T(x,y), giving me the x- and y-components of the displacement vector for every pixel (x,y) on the left, pointing to the corresponding pixel (x',y') on the right.

To get a more general result I would like to use normalized cross-correlation. For this I multiply every pixel value from the left with the corresponding pixel value from the right and sum over these products. The transformation I am looking for should connect the pixels that maximize this sum: when the sum is maximized, I know that I multiplied corresponding pixels.

I have tried a lot with this but have not managed to get it working. My question is whether somebody has an idea, or has ever done something similar.

import numpy as np
from PIL import Image

# Load the 16-bit pngs as float arrays so the statistics below
# do not suffer from integer truncation.
left = np.array(Image.open('left.png'), dtype=float)
right = np.array(Image.open('right.png'), dtype=float)

# for normalization (http://en.wikipedia.org/wiki/Cross-correlation#Normalized_cross-correlation)
left = (left - left.mean()) / left.std()
right = (right - right.mean()) / right.std()
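
Wrapped in a helper, the score described above becomes a single number: 1.0 for identical images and smaller otherwise. A minimal sketch (ncc_score is a made-up name; it assumes both arrays have the same shape):

def ncc_score(a, b):
    # Normalize each image to zero mean and unit standard deviation,
    # then average the pixelwise products (normalized cross-correlation).
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return (a * b).mean()

print(ncc_score(left, right))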

Please let me know if I can make this question clearer. I still have to check out how to post questions using LaTeX.

Thank you very much for your input.

[left.png] http://i.stack.imgur.com/oSTER.png
[right.png] http://i.stack.imgur.com/Njahj.png


I'm afraid that in most cases 16-bit images appear just black (at least on the systems I use) :( but of course the data is in there.

Let me try to clarify my question. I am looking for a vector field of displacement vectors that point from every pixel in left.png to the corresponding pixel in right.png. My problem is that I am not sure about the constraints I have.

$ \vec{r} + \vec{d}(\vec{r}) = \vec{r}^{\,\prime} $

where vector r (components x and y) points to a pixel in left.png and vector r' (components x' and y') points to the corresponding pixel in right.png. For every r there is one displacement vector.

What I did earlier was to find the components of the vector field d manually and fit them to a second-degree polynomial:

$ \left(\begin{array}{c}x \\ y\end{array}\right) + \left(\begin{array}{c}\Delta x(x,y) \\ \Delta y(x,y)\end{array}\right)=\left(\begin{array}{c}x' \\ y' \end{array}\right) $

So I fitted:

$ \Delta x(x,y) = K_0 + K_1\cdot x + K_2 \cdot y + K_3 \cdot x^2 + K_4 \cdot xy + K_5 \cdot y^2 $

$ \Delta y(x,y) = K_6 + K_7\cdot x + K_8 \cdot y + K_9 \cdot x^2 + K_{10} \cdot xy + K_{11} \cdot y^2 $
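
For what it's worth, once you have measured spot pairs, fitting the twelve coefficients is an ordinary linear least-squares problem. A minimal sketch, assuming xs, ys hold the left-hand spot coordinates and dxs, dys the measured displacement components (all 1D numpy arrays):

import numpy as np

# One row per spot with the six monomials 1, x, y, x^2, xy, y^2.
A = np.column_stack([np.ones_like(xs), xs, ys, xs**2, xs*ys, ys**2])

# Two independent least-squares fits: K_0..K_5 for delta-x,
# K_6..K_11 for delta-y.
Kx, *_ = np.linalg.lstsq(A, dxs, rcond=None)
Ky, *_ = np.linalg.lstsq(A, dys, rcond=None)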

Does this make sense to you? Is it possible to get all the delta-x(x,y) and delta-y(x,y) via cross-correlation? The cross-correlation should be maximized when corresponding pixels are linked together through the displacement vectors, right?

So the algorithm I was thinking of is as follows:


  1. Deform right.png
  2. Get the value of the cross-correlation
  3. Deform right.png further
  4. Get the value of the cross-correlation and compare it to the value before
  5. If it is greater, keep the deformation; if not, redo the deformation and try something else
  6. After the cross-correlation value is maximized, you know what the deformation is :) (see the sketch below)
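
One way to run steps 1-6 above without hand-rolling the search is to let a general-purpose optimizer vary the twelve polynomial coefficients. A rough sketch, where warp_right is a hypothetical function that applies the polynomial deformation to right.png (a possible implementation follows after the next paragraph) and ncc_score is the helper from further above:

import numpy as np
from scipy.optimize import minimize

def negative_ncc(K):
    # Deform right.png with the current coefficients and score it
    # against left.png; minimizing the negative maximizes the NCC.
    return -ncc_score(left, warp_right(K))

# Start from the identity transformation (all coefficients zero).
res = minimize(negative_ncc, np.zeros(12), method='Nelder-Mead')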

About the deformation: could one first do a shift along the x- and y-directions to maximize the cross-correlation, then in a second step stretch or compress in an x- and y-dependent way, and in a third step apply a quadratic deformation in x and y, and repeat this procedure iteratively? I really have a problem doing this with integer coordinates. Do you think I would have to interpolate the picture to obtain a continuous distribution? I have to think about this again :( Thanks to everybody for taking part :)
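
Regarding the integer-coordinate problem: sampling the image at non-integer positions is exactly what interpolation does, and scipy provides it directly. A sketch of the hypothetical warp_right from above, using scipy.ndimage.map_coordinates with bilinear interpolation (order=1):

import numpy as np
from scipy.ndimage import map_coordinates

def warp_right(K):
    # Evaluate the second-degree displacement polynomial on the full
    # pixel grid and sample right.png at the shifted, non-integer
    # positions by interpolation.
    ny, nx = right.shape
    y, x = np.mgrid[0:ny, 0:nx]
    dx = K[0] + K[1]*x + K[2]*y + K[3]*x**2 + K[4]*x*y + K[5]*y**2
    dy = K[6] + K[7]*x + K[8]*y + K[9]*x**2 + K[10]*x*y + K[11]*y**2
    return map_coordinates(right, [y + dy, x + dx], order=1, mode='nearest')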

Answer

OpenCV (and with it the Python OpenCV binding) has a StarDetector class which implements this algorithm.

As an alternative you might have a look at the OpenCV SIFT class, which stands for Scale-Invariant Feature Transform.
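
For orientation, feature matching with SIFT is only a few lines. A sketch with a modern OpenCV build (cv2.SIFT_create exists in opencv-python 4.4+; older versions used cv2.SIFT() or cv2.xfeatures2d.SIFT_create()), assuming left8 and right8 are 8-bit versions of your images, since SIFT does not accept 16-bit input:

import cv2

# Detect keypoints and compute descriptors in both images.
sift = cv2.SIFT_create()
kp_l, des_l = sift.detectAndCompute(left8, None)
kp_r, des_r = sift.detectAndCompute(right8, None)

# Brute-force matching with cross-checking; every surviving match
# links a spot on the left to its counterpart on the right.
matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
matches = matcher.match(des_l, des_r)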

Update

Regarding your comment: I understand that the "right" transformation will maximize the cross-correlation between the images, but I don't understand how you choose the set of transformations over which to maximize. Maybe if you know the coordinates of three matching points (either by some heuristics or by choosing them by hand), and if you expect an affine transformation, you could use something like cv2.getAffineTransform to get a good initial transformation for your maximization process. From there you could apply small additional transformations to obtain a set over which to maximize. But this approach seems to me like re-inventing something that SIFT could take care of.

To actually transform your test image you can use cv2.warpAffine, which can also take care of border values (e.g. pad with 0). To calculate the cross-correlation you could use scipy.signal.correlate2d.
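
A minimal sketch of these two calls, assuming M is a 2x3 affine matrix (for example the result of cv2.getAffineTransform further below):

import cv2
from scipy.signal import correlate2d

# Warp right.png with the affine matrix M; the destination size is
# given as (width, height) and out-of-source pixels are padded with 0.
# warpAffine works on 8-bit, 16-bit and float32 data, hence the cast.
h, w = right.shape
warped = cv2.warpAffine(right.astype('float32'), M, (w, h))

# Full 2D cross-correlation surface between the images;
# the location of its peak gives the residual shift.
cc = correlate2d(left, warped, mode='same')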

Update

Your latest update did indeed clarify some points for me. But I think that a vector field of displacements is not the most natural thing to look for, and this is also where the misunderstanding came from. I was thinking more along the lines of a global transformation T which, applied to any point (x,y) of the left image, gives (x',y') = T(x,y) on the right side, where T has the same analytical form for every pixel. For example, this could be a combination of a displacement, a rotation, a scaling, maybe some perspective transformation. I cannot say whether it is realistic or not to hope to find such a transformation; this depends on your setup, but if the scene is physically the same on both sides I would say it is reasonable to expect some affine transformation. This is why I suggested cv2.getAffineTransform. It is of course trivial to calculate your displacement vector field from such a T, as this is just T(x,y) - (x,y).

The big advantage would be that your transformation has only very few degrees of freedom, instead of, I would argue, the 2N degrees of freedom of the displacement vector field, where N is the number of bright spots.

If it is indeed an affine transformation, I would suggest an algorithm like this:

  • identify three bright and well isolated spots on the left
  • for each of these three spots, define a bounding box so that you can hope to identify the corresponding spot within it in the right image
  • find the coordinates of the corresponding spots, e.g. with some correlation method as implemented in cv2.matchTemplate (see the sketch below), or by simply finding the brightest spot within the bounding box
  • once you have three matching pairs of coordinates, calculate the affine transformation which transforms one set into the other with cv2.getAffineTransform
  • apply this affine transformation to the left image; as a check that you found the right one, you could test whether the overall normalized cross-correlation is above some threshold, or drops significantly if you displace one image with respect to the other
  • if you wish and still need it, calculate the displacement vector field trivially from your transformation T
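
A sketch of the template-matching step referenced in the list above; x0, y0 and the patch sizes are made-up placeholders you would choose per spot:

import cv2

# Cut a small patch around a bright left-hand spot and a somewhat
# larger search window at the expected position on the right
# (matchTemplate wants 8-bit or float32 input, hence the casts).
patch = left[y0:y0 + 20, x0:x0 + 20].astype('float32')
search = right[y0 - 10:y0 + 30, x0 - 10:x0 + 30].astype('float32')

# TM_CCORR_NORMED is OpenCV's normalized cross-correlation score.
res = cv2.matchTemplate(search, patch, cv2.TM_CCORR_NORMED)
_, _, _, max_loc = cv2.minMaxLoc(res)  # (x, y) of the best match in `search`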

Update

It seems cv2.getAffineTransform expects the somewhat awkward input data type 'float32'. Assuming the source coordinates are (sxi, syi) and the destination coordinates (dxi, dyi), with i = 0, 1, 2, what you need is:

# The three point correspondences, as float32 arrays as required.
src = np.array( ((sx0,sy0),(sx1,sy1),(sx2,sy2)), dtype='float32' )
dst = np.array( ((dx0,dy0),(dx1,dy1),(dx2,dy2)), dtype='float32' )

# 2x3 affine matrix mapping the src points onto the dst points.
result = cv2.getAffineTransform(src,dst)
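
The returned result is a 2x3 float64 matrix, which is exactly the form that cv2.warpAffine, shown further above, expects for its transformation argument.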

