Image Comparison by Fingerprinting


Problem description

I'm looking for ways to find image duplicates by fingerprinting. I understand that this is done by applying hash functions on images, and each image would have a unique hash value.

I am fairly new to image processing and don't know much about hashing. How exactly am I supposed to apply hash functions and generate hash values?

Thanks in advance

Solution

You need to be careful with hashing. Some image formats, such as JPEG and PNG, store dates/times and other metadata inside the file, which makes two visually identical images appear different to ordinary tools such as md5 and cksum.

Here is an example. At the command line in Terminal, use ImageMagick to make two images, both identical 128x128 red squares:

convert -size 128x128 xc:red a.png
convert -size 128x128 xc:red b.png

Now check their MD5 sums:

md5 [ab].png
MD5 (a.png) = b4b82ba217f0b36e6d3ba1722f883e59
MD5 (b.png) = 6aa398d3aaf026c597063c5b71b8bd1a

Or their checksums:

cksum [ab].png
4158429075 290 a.png
3657683960 290 b.png

Oops, they differ according to both md5 and cksum. Why? Because the timestamps stored in the two files are 1 second apart.
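The same effect can be reproduced without ImageMagick. The following is a minimal sketch, assuming a hand-built PNG whose only metadata is a `date:create` text chunk; the helper names (`png_chunk`, `make_red_png`, `pixel_data_sha256`) and the timestamps are made up for illustration. It hashes the decompressed scanline stream rather than fully decoded pixels, which is a simplification of what a real image tool does.

```python
import hashlib
import struct
import zlib


def png_chunk(kind: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC-32 of type+data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def make_red_png(size: int, timestamp: str) -> bytes:
    """Build a size x size solid red RGB PNG carrying a tEXt timestamp."""
    ihdr = struct.pack(">IIBBBBB", size, size, 8, 2, 0, 0, 0)  # 8-bit RGB
    row = b"\x00" + b"\xff\x00\x00" * size  # filter byte 0, then red pixels
    idat = zlib.compress(row * size)
    text = b"date:create\x00" + timestamp.encode("latin-1")
    return (
        b"\x89PNG\r\n\x1a\n"
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"tEXt", text)  # metadata only, no effect on pixels
        + png_chunk(b"IDAT", idat)
        + png_chunk(b"IEND", b"")
    )


def pixel_data_sha256(png: bytes) -> str:
    """Hash IHDR plus the decompressed IDAT stream, skipping metadata chunks."""
    digest = hashlib.sha256()
    compressed = b""
    pos = 8  # skip the 8-byte PNG signature
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        kind = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        if kind == b"IHDR":
            digest.update(data)
        elif kind == b"IDAT":
            compressed += data
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
    digest.update(zlib.decompress(compressed))
    return digest.hexdigest()


# Arbitrary illustrative timestamps, one second apart.
a = make_red_png(128, "2014-01-01T12:00:00")
b = make_red_png(128, "2014-01-01T12:00:01")

file_md5_a = hashlib.md5(a).hexdigest()
file_md5_b = hashlib.md5(b).hexdigest()
print("whole-file MD5 equal:  ", file_md5_a == file_md5_b)
print("pixel-data hash equal: ", pixel_data_sha256(a) == pixel_data_sha256(b))
```

A whole-file hash sees every byte, including the text chunk, so a one-character change in the timestamp is enough to change it. A hash restricted to the image payload is unaffected.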

I would suggest you use ImageMagick to checksum "just the image data" and not the metadata - unless, of course, the date is important to you:

identify -format %# a.png
e74164f4bab2dd8f7f612f8d2d77df17106bac77b9566aa888d31499e9cf8564

identify -format %# b.png
e74164f4bab2dd8f7f612f8d2d77df17106bac77b9566aa888d31499e9cf8564

Now they are both the same, because the image is the same - just the metadata differs.
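To find duplicates across many files, one option is to group paths by that signature. This is a rough sketch, assuming ImageMagick's `identify` is on the PATH; `imagemagick_signature` and `find_duplicates` are hypothetical helper names, and the fingerprint function is a parameter so another hash can be swapped in.

```python
import subprocess
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def imagemagick_signature(path: str) -> str:
    """Return the pixel-data signature printed by `identify -format %#`."""
    result = subprocess.run(
        ["identify", "-format", "%#", path],
        capture_output=True,
        text=True,
        check=True,  # raise if identify fails, e.g. on an unreadable file
    )
    return result.stdout.strip()


def find_duplicates(
    paths: Iterable[str],
    fingerprint: Callable[[str], str] = imagemagick_signature,
) -> List[List[str]]:
    """Group paths by fingerprint and return only groups with 2+ members."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        groups[fingerprint(path)].append(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

With the two files created above, `find_duplicates(["a.png", "b.png"])` should report them as one group, since their signatures match.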

Of course, you may be more interested in "Perceptual Hashing" where you just get an idea if two images "look similar". If so, look here.
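As a rough illustration of the idea, here is a sketch of an "average hash", one simple perceptual hash (not necessarily the method the link above described). It assumes the image has already been decoded to grayscale rows of 0-255 integers whose width and height are multiples of the hash size; `average_hash` and `hamming_distance` are illustrative names, and the test images are synthetic.

```python
from typing import List, Sequence


def average_hash(gray: Sequence[Sequence[int]], hash_size: int = 8) -> int:
    """Average hash of a grayscale image given as rows of 0-255 ints.

    The image is shrunk to hash_size x hash_size by block averaging, and
    each cell contributes one bit: 1 if brighter than the overall mean.
    """
    height, width = len(gray), len(gray[0])
    block_h, block_w = height // hash_size, width // hash_size
    cells: List[float] = []
    for by in range(hash_size):
        for bx in range(hash_size):
            total = 0
            for y in range(by * block_h, (by + 1) * block_h):
                for x in range(bx * block_w, (bx + 1) * block_w):
                    total += gray[y][x]
            cells.append(total / (block_h * block_w))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Synthetic 64x64 images: a left-to-right ramp, a slightly brighter copy,
# and a mirrored copy.
gradient = [[x * 4 for x in range(64)] for _ in range(64)]
brighter = [[min(255, v + 3) for v in row] for row in gradient]
flipped = [row[::-1] for row in gradient]

print(hamming_distance(average_hash(gradient), average_hash(brighter)))  # 0
print(hamming_distance(average_hash(gradient), average_hash(flipped)))   # 64
```

A small Hamming distance suggests the images look alike, and a large one suggests they do not. What threshold counts as "similar" depends on the data and has to be tuned.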

Or you may be interested in allowing slight differences in brightness, or orientation, or cropping - which is another topic altogether.
