Near-Duplicate Image Detection


Question

What's a fast way to sort a given set of images by their similarity to each other?

At the moment I have a system that does histogram analysis between two images, but this is a very expensive operation and seems like overkill.

Ideally, I am looking for an algorithm that would give each image a score (for example an integer score, such as the RGB average) so that I can just sort by that score. Identical scores, or scores next to each other, are possible duplicates.

0299393
0599483
0499994 <- possible dupe
0499999 <- possible dupe
1002039
4995994
6004994 
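
To make the idea concrete, here is a minimal sketch of the "sort by score, then compare neighbours" step, using the example scores above as plain integers. The file names, the `possible_dupes` helper and the `tolerance` value are all hypothetical and only serve the illustration.

```python
def possible_dupes(scores, tolerance):
    """Sort images by score and return adjacent pairs whose scores
    differ by at most `tolerance` (a hypothetical cut-off)."""
    ordered = sorted(scores.items(), key=lambda item: item[1])
    pairs = []
    # After sorting, only neighbouring entries need to be compared.
    for (name_a, score_a), (name_b, score_b) in zip(ordered, ordered[1:]):
        if score_b - score_a <= tolerance:
            pairs.append((name_a, name_b))
    return pairs


# Scores from the list above; the file names are made up.
scores = {
    "a.jpg": 299393,
    "b.jpg": 599483,
    "c.jpg": 499994,
    "d.jpg": 499999,
    "e.jpg": 1002039,
    "f.jpg": 4995994,
    "g.jpg": 6004994,
}

print(possible_dupes(scores, tolerance=10))  # [('c.jpg', 'd.jpg')]
```

The sort costs O(n log n) and the scan is linear, so the expensive part is computing a score that actually reflects similarity.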

RGB average per image sucks; is there something similar?

Recommended answer

There has been a lot of research on image searching and similarity measures. It's not an easy problem. In general, a single int won't be enough to determine if images are very similar. You'll have a high false-positive rate.
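
A small sketch of why a single integer tends to produce false positives, using the RGB average from the question as the score. The `rgb_average_score` helper and the tiny four-pixel "images" are made up for illustration.

```python
def rgb_average_score(pixels):
    """Collapse an image, given as a list of (r, g, b) tuples,
    into one integer: the mean of all channel values."""
    total = sum(r + g + b for r, g, b in pixels)
    return total // (3 * len(pixels))


# Three visibly different images: solid red, solid green, solid blue.
red = [(255, 0, 0)] * 4
green = [(0, 255, 0)] * 4
blue = [(0, 0, 255)] * 4

print(rgb_average_score(red))    # 85
print(rgb_average_score(green))  # 85
print(rgb_average_score(blue))   # 85
```

All three get the same score, so sorting by it would flag them as duplicates even though they share no colour.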

However, since there has been a lot of research done, you might take a look at some of it. For example, this paper (PDF) gives a compact image fingerprinting algorithm that is suitable for finding duplicate images quickly and without storing much data. It seems like this is the right approach if you want something robust.

If you're looking for something simpler, but definitely more ad-hoc, this SO question has a few decent ideas.
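
As an illustration of the ad-hoc end of the spectrum, here is a minimal sketch of an "average hash" fingerprint compared by Hamming distance. This is a commonly used simple technique; it is not the algorithm from the paper mentioned above and not necessarily one of the ideas in the linked question. The function names, the 8x8 hash size and the synthetic test images are assumptions, and the input is assumed to be a grayscale image already decoded into a 2D list whose dimensions are multiples of the hash size.

```python
def average_hash(gray, hash_size=8):
    """Return a hash_size*hash_size-bit fingerprint of a grayscale image.

    `gray` is a 2D list of 0-255 values; its height and width are
    assumed to be non-zero multiples of `hash_size`.
    """
    height, width = len(gray), len(gray[0])
    block_h, block_w = height // hash_size, width // hash_size

    # Shrink the image to hash_size x hash_size by averaging each block.
    cells = []
    for by in range(hash_size):
        for bx in range(hash_size):
            block = [
                gray[y][x]
                for y in range(by * block_h, (by + 1) * block_h)
                for x in range(bx * block_w, (bx + 1) * block_w)
            ]
            cells.append(sum(block) / len(block))

    # One bit per cell: set when the cell is brighter than the overall mean.
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


# Synthetic 16x16 test images: a horizontal gradient, a slightly
# brighter copy of it, and its inverse.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[min(255, v + 10) for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

print(hamming_distance(average_hash(original), average_hash(brighter)))  # 0
print(hamming_distance(average_hash(original), average_hash(inverted)))  # 64
```

Unlike a single sortable score, fingerprints of this kind are compared pairwise by distance, with a small distance suggesting a possible duplicate. Any threshold would have to be tuned on real data.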
