Near-Duplicate Image Detection

Question

What's a fast way to sort a given set of images by their similarity to each other?

At the moment I have a system that does histogram analysis between two images, but this is a very expensive operation and seems like overkill.

Ideally, I am looking for an algorithm that would give each image a score (for example an integer score, such as the RGB average) so that I can simply sort by that score. Identical scores, or scores next to each other, are possible duplicates.

0299393
0599483
0499994 <- possible dupe
0499999 <- possible dupe
1002039
4995994
6004994 

The per-image RGB average sucks as a score. Is there something similar?

Recommended answer

There has been a lot of research on image searching and similarity measures. It's not an easy problem. In general, a single int won't be enough to determine whether images are very similar. You'll have a high false-positive rate.
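To make the false-positive problem concrete, here is a minimal sketch in Python. The function name `rgb_average_score` and the representation of an image as rows of `(r, g, b)` tuples are assumptions made for illustration only. It shows two visibly different images collapsing to the same single-integer score.

```python
def rgb_average_score(pixels):
    """Collapse an image (rows of (r, g, b) tuples) into one integer score.

    Hypothetical helper for illustration: the mean of all channel values.
    """
    total = 0
    count = 0
    for row in pixels:
        for r, g, b in row:
            total += r + g + b
            count += 3
    return total // count


# A black-and-white checkerboard and an almost uniform grey image.
checkerboard = [[(0, 0, 0), (255, 255, 255)],
                [(255, 255, 255), (0, 0, 0)]]
flat_grey = [[(127, 127, 127), (128, 128, 128)],
             [(128, 128, 128), (127, 127, 127)]]

print(rgb_average_score(checkerboard))  # 127
print(rgb_average_score(flat_grey))     # 127
```

Both images score 127, so sorting by this value would place them next to each other even though they look nothing alike.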

However, since a lot of research has been done, you might take a look at some of it. For example, this paper (PDF) gives a compact image fingerprinting algorithm that is suitable for finding duplicate images quickly and without storing much data. It seems like this is the right approach if you want something robust.

If you're looking for something simpler, but definitely more ad hoc, this SO question has a few decent ideas.
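As one example of what a simple, ad hoc fingerprint can look like, here is a minimal sketch of a difference hash (dHash) in pure Python. This is not the algorithm from the paper mentioned above, and it is not necessarily one of the ideas in the linked question. It is a swapped-in, commonly used technique shown only for illustration. The helper names `shrink`, `dhash`, and `hamming` are hypothetical, and the image is assumed to be already decoded into a 2D list of grayscale values that is at least 9 pixels wide and 8 pixels tall.

```python
def shrink(gray, width, height):
    """Downscale a 2D grayscale grid by averaging rectangular blocks.

    Assumes the source is at least `width` x `height` pixels.
    """
    src_h, src_w = len(gray), len(gray[0])
    out = []
    for y in range(height):
        y0, y1 = y * src_h // height, (y + 1) * src_h // height
        row = []
        for x in range(width):
            x0, x1 = x * src_w // width, (x + 1) * src_w // width
            block = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(gray, hash_size=8):
    """Return a hash_size * hash_size bit fingerprint as a Python int.

    Each bit records whether a cell is brighter than its right-hand neighbour.
    """
    small = shrink(gray, hash_size + 1, hash_size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Count the bits in which two fingerprints differ."""
    return bin(a ^ b).count("1")


# Synthetic 32x32 test images: a horizontal gradient, a slightly
# brightened copy, and a mirrored version.
original = [[x * 8 for x in range(32)] for _ in range(32)]
brighter = [[min(255, v + 5) for v in row] for row in original]
mirrored = [list(reversed(row)) for row in original]

print(hamming(dhash(original), dhash(brighter)))  # 0
print(hamming(dhash(original), dhash(mirrored)))  # 64
```

Note that fingerprints like this are compared by Hamming distance rather than sorted numerically: two hashes that differ in a single high-order bit are far apart as integers but describe nearly identical images. That is one reason a plain sort on a single score tends not to work for this problem.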
