What is image hashing used for?

Question

I sometimes hear this term and wonder what it is used for.

Recommended answer

A hash function takes arbitrary data and produces output of a fixed size (usually a very small one). There are many different types of hashes, but image hashing specifically is used to:

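  • Find duplicates quickly. Almost any hash function works for this: instead of searching through whole images, you look up the image's hash.
  • Find similar images, which I explain below.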
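
The first use, exact-duplicate lookup, can be sketched roughly as follows. This is only an illustration, not part of the original answer: the function name `find_exact_duplicates`, the choice of SHA-256, and the fake byte strings standing in for image files are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def find_exact_duplicates(images):
    """Group names whose raw bytes are identical.

    `images` maps a name to the raw bytes of a file (hypothetical input
    format chosen for this sketch).
    """
    by_digest = defaultdict(list)
    for name, data in images.items():
        # Compare short fixed-size digests instead of whole files.
        digest = hashlib.sha256(data).hexdigest()
        by_digest[digest].append(name)
    # Keep only digests shared by more than one file.
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Fake byte strings stand in for real image files.
images = {
    "a.png": b"\x89PNG fake image bytes 1",
    "b.png": b"\x89PNG fake image bytes 2",
    "copy_of_a.png": b"\x89PNG fake image bytes 1",
}
print(find_exact_duplicates(images))  # [['a.png', 'copy_of_a.png']]
```

Note that this only catches byte-for-byte copies; a resized or re-encoded copy would get an unrelated digest.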

Images that look identical to us can be very different if you compare only their raw bytes. This can be due to:

  • resizing
  • rotation
  • slightly different color gamma
  • a different file format
  • minor noise, watermarks, and artifacts

Even if two images differ in only a single byte, applying a hash function to them can give very different results (for hashes like MD5 or SHA, the results will almost certainly be completely different).
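
To see this effect concretely, here is a small sketch that flips a single bit in some data and compares the MD5 digests. The data is synthetic (it is not a real image) and the helper `bit_difference` is a name made up for this example.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Synthetic stand-in for an image file: 1024 bytes of sample data.
original = bytes(range(256)) * 4

# Flip one bit in one byte.
changed = bytearray(original)
changed[100] ^= 0x01
modified = bytes(changed)

d1 = hashlib.md5(original).digest()
d2 = hashlib.md5(modified).digest()

print("input bits that differ: ", bit_difference(original, modified))
print("digest of original:     ", d1.hex())
print("digest of modified:     ", d2.hex())
print("digest bits that differ:", bit_difference(d1, d2), "of", len(d1) * 8)
```

With a cryptographic hash you would expect roughly half of the digest bits to change, which is why such hashes cannot tell you that two inputs were almost the same.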

So you need a hash function that produces a similar (or even identical) hash for similar images. One generic approach is locality-sensitive hashing. But since we know what kinds of changes images typically undergo, we can come up with a more specialized kind of hash.

The best-known algorithms are:

  • a-hash. Average hashing is the simplest algorithm and uses only a few transformations: scale the image down, convert it to greyscale, calculate the mean, and binarize the greyscale image against that mean. Then convert the resulting binary image into an integer. The algorithm is so simple that you can implement it in an hour.
  • p-hash. Perceptual hashing uses a similar approach, but instead of averaging it relies on the discrete cosine transform (DCT), a popular transform in signal processing.
  • d-hash. Difference hashing uses the same approach as a-hash, but instead of information about average values it uses gradients (the differences between adjacent pixels).
  • w-hash. Very similar to p-hash, but it uses a wavelet transform instead of the DCT.
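
As a rough illustration of the two simplest algorithms above, here is a minimal pure-Python sketch of a-hash and d-hash. It makes several assumptions that are not in the original answer: the image is already greyscale and given as a list of rows of numbers, downscaling is done by plain block averaging (so the input must be at least as large as the target grid), and hashes are compared by Hamming distance. All function names are made up for the example; p-hash and w-hash are not covered.

```python
def shrink(pixels, width, height):
    """Downscale a greyscale image (list of rows) by block averaging.

    Assumes the input is at least `width` x `height` pixels.
    """
    src_h, src_w = len(pixels), len(pixels[0])
    out = []
    for r in range(height):
        r0, r1 = r * src_h // height, (r + 1) * src_h // height
        row = []
        for c in range(width):
            c0, c1 = c * src_w // width, (c + 1) * src_w // width
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def bits_to_int(bits):
    """Pack an iterable of booleans into one integer, first bit highest."""
    value = 0
    for bit in bits:
        value = (value << 1) | int(bit)
    return value


def average_hash(pixels, size=8):
    """a-hash: one bit per cell, set if the cell is brighter than the mean."""
    small = shrink(pixels, size, size)
    flat = [v for row in small for v in row]
    mean = sum(flat) / len(flat)
    return bits_to_int(v > mean for v in flat)


def difference_hash(pixels, size=8):
    """d-hash: one bit per horizontally adjacent pair, set if it gets brighter."""
    small = shrink(pixels, size + 1, size)
    return bits_to_int(row[c] < row[c + 1] for row in small for c in range(size))


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


# Synthetic 32x32 test images: a horizontal gradient, a slightly brighter
# copy of it, and an unrelated vertical gradient.
image = [[x * 8 for x in range(32)] for _ in range(32)]
brighter = [[v + 5 for v in row] for row in image]
other = [[y * 8 for _ in range(32)] for y in range(32)]

print("a-hash, image vs brighter:", hamming(average_hash(image), average_hash(brighter)))
print("a-hash, image vs other:   ", hamming(average_hash(image), average_hash(other)))
print("d-hash, image vs brighter:", hamming(difference_hash(image), difference_hash(brighter)))
print("d-hash, image vs other:   ", hamming(difference_hash(image), difference_hash(other)))
```

A small Hamming distance suggests similar images and a large one suggests different images; where to put the threshold is a tuning decision that depends on your data.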

By the way, if you use Python, all of these hashes are already implemented in this library.
