How to detect image translation with only numpy and PIL
Problem description
Given two images, I need to detect if there is a translation offset between the two. I am only able to use numpy and PIL.
This post shows how to apply an (x, y) translation with PIL, but I haven't found anything similar for detecting the translation.
From what I've read, cross-correlation seems to be part of the solution, and there is the numpy.correlate function. However, I don't know how to use the output of this function to detect horizontal and vertical translation coordinates.
Recommended answer
Since these are (almost) 2D arrays, you want the scipy.signal.correlate2d() function.
First, read your images and cast as arrays:
import numpy as np
from PIL import Image
import requests
import io
image1 = "https://i.stack.imgur.com/lf2lc.png"
image2 = "https://i.stack.imgur.com/MMSdM.png"
img1 = np.asarray(Image.open(io.BytesIO(requests.get(image1).content)))
img2 = np.asarray(Image.open(io.BytesIO(requests.get(image2).content)))
# img2 is greyscale; make it 2D by taking mean of channel values.
img2 = np.mean(img2, axis=-1)
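If the images are local files rather than URLs, or if it is unclear which of them carries colour channels, both can be reduced to 2D arrays in the same way. The following is a minimal sketch, not part of the original answer: `load_grey` is a hypothetical helper name, and the in-memory PNG stands in for a real file.

```python
import io

import numpy as np
from PIL import Image


def load_grey(source):
    """Open an image (path or file-like object) and return a 2D float array.

    Hypothetical helper: converts to PIL's 8-bit greyscale mode "L" so that
    RGB, RGBA and greyscale inputs all end up with shape (rows, cols).
    """
    with Image.open(source) as im:
        return np.asarray(im.convert("L"), dtype=float)


# Demonstration with a small in-memory PNG instead of a file on disk.
rgb = np.zeros((4, 6, 3), dtype=np.uint8)
rgb[1, 2] = (255, 255, 255)          # one white pixel
buf = io.BytesIO()
Image.fromarray(rgb).save(buf, format="PNG")
buf.seek(0)

grey = load_grey(buf)
print(grey.shape)  # (4, 6)
```

Converting to float up front also avoids integer overflow when the arrays are multiplied during correlation.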
Now that we have the two images, we can adapt the example in the scipy.signal.correlate2d() documentation:
from scipy import signal
corr = signal.correlate2d(img1, img2, mode='same')
If you want to avoid using scipy for some reason, then this should be equivalent:
pad = np.max(img1.shape) // 2
fft1 = np.fft.fft2(np.pad(img1, pad))
fft2 = np.fft.fft2(np.pad(img2, pad))
prod = fft1 * fft2.conj()
result_full = np.fft.fftshift(np.fft.ifft2(prod))
corr = result_full.real[1+pad:-pad+1, 1+pad:-pad+1]
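When the two images differ in size, or when wrap-around at the edges is a concern, the same FFT idea can be wrapped in a small helper that pads both inputs to a common shape. This is a minimal sketch rather than the answer's code: `cross_correlate_fft` is a hypothetical name, and it returns the full correlation instead of the cropped one above, with zero lag at index `b.shape - 1` along each axis.

```python
import numpy as np


def cross_correlate_fft(a, b):
    """Full linear cross-correlation of two 2D arrays using zero-padded FFTs.

    Output shape is a.shape + b.shape - 1 per axis. Entry [i, j] holds the
    correlation at lag (i - (b.shape[0] - 1), j - (b.shape[1] - 1)).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pad to the full output size so circular correlation equals linear correlation.
    shape = tuple(int(m + n - 1) for m, n in zip(a.shape, b.shape))
    fa = np.fft.rfft2(a, s=shape)
    fb = np.fft.rfft2(b, s=shape)
    circ = np.fft.irfft2(fa * fb.conj(), s=shape)
    # Negative lags are stored at the end of each axis; roll them to the front.
    return np.roll(circ, shift=(b.shape[0] - 1, b.shape[1] - 1), axis=(0, 1))


# Two single-pixel "images": the peak reveals how far apart the pixels are.
a = np.zeros((6, 5))
a[2, 3] = 1.0
b = np.zeros((4, 4))
b[1, 1] = 1.0

full = cross_correlate_fft(a, b)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(full), full.shape))
print(full.shape, peak)  # (9, 8) (4, 5)
```

Subtracting the zero-lag index `(3, 3)` from the peak `(4, 5)` gives a lag of `(1, 2)`, which is the row and column distance from the pixel in `b` to the pixel in `a`.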
Now we can compute the position of the maximum correlation:
y, x = np.unravel_index(np.argmax(corr), corr.shape)
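The peak position is not yet the translation itself: it still has to be measured from the point where the peak would sit for two unshifted images. The sketch below assumes a correlation map whose zero-shift peak lies at `shape // 2` (the centre convention used for the green point in the plot), and checks the idea on synthetic data. `peak_to_offset` is a hypothetical helper, and the circular FFT correlation is used only to keep the example numpy-only.

```python
import numpy as np


def peak_to_offset(corr):
    """Convert the peak of a centred correlation map into a (dy, dx) offset.

    Assumes the peak sits at corr.shape // 2 when there is no shift.
    """
    y, x = np.unravel_index(np.argmax(corr), corr.shape)
    cy, cx = np.array(corr.shape) // 2
    return int(y - cy), int(x - cx)


# Synthetic check: shift a random image by a known amount (with wrap-around).
rng = np.random.default_rng(0)
base = rng.random((64, 48))
moved = np.roll(base, shift=(5, -7), axis=(0, 1))

# Remove the mean so overall brightness does not dominate the correlation.
a = moved - moved.mean()
b = base - base.mean()

# Circular cross-correlation; fftshift moves zero shift to index shape // 2.
corr = np.fft.fftshift(np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(b).conj()).real)

offset = peak_to_offset(corr)
print(offset)  # (5, -7)
```

The offset is reported as (rows, columns), so a positive `dy` means the first image is shifted down relative to the second, and a negative `dx` means it is shifted left.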
Now we can visualize the result, again adapting the documentation example:
import matplotlib.pyplot as plt
# Array shape is (rows, cols), i.e. (y, x), so unpack in that order.
y2, x2 = np.array(img2.shape) // 2
fig, (ax_img1, ax_img2, ax_corr) = plt.subplots(1, 3, figsize=(15, 5))
im = ax_img1.imshow(img1, cmap='gray')
ax_img1.set_title('img1')
ax_img2.imshow(img2, cmap='gray')
ax_img2.set_title('img2')
im = ax_corr.imshow(corr, cmap='viridis')
ax_corr.set_title('Cross-correlation')
ax_img1.plot(x, y, 'ro')
ax_img2.plot(x2, y2, 'go')
ax_corr.plot(x, y, 'ro')
plt.show()
The green point is the centre of img2. The red point is the position at which placing the green point gives the maximum correlation.