Algorithm to compare two images

Problem description

Given two different image files (in whatever format I choose), I need to write a program to predict the chance that one is an illegal copy of the other. The author of the copy may do things like rotating the image, making a negative, or adding trivial details (as well as changing the dimensions of the image).

Do you know of any algorithm to do this kind of job?

Recommended answer

These are simply ideas I've had while thinking about the problem. I have never tried them, but I like thinking about problems like this!

Before you begin

Consider normalising the pictures. If one has a higher resolution than the other, consider the possibility that one of them is a compressed version of the other; scaling the higher resolution down might then provide more accurate results.

Consider scanning various prospective areas of the image that could represent zoomed portions of the image, at various positions and rotations. It starts getting tricky if one of the images is a skewed version of the other; these are the sorts of limitations you should identify and compromise on.

Matlab is an excellent tool for testing and evaluating images.

Testing algorithms

You should test (at the minimum) against a large, human-analysed set of test data where the matches are known beforehand. If, for example, your test data contains 1,000 images of which 5% match, you now have a reasonably reliable benchmark. An algorithm that reports 10% positives is not as good as one that reports 4% positives on this test data, because at least half of the first algorithm's positives must be false. However, one algorithm may find all the matches but also have a large 20% false positive rate, so there are several ways to rate your algorithms.
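As a rough illustration of rating algorithms in more than one way, the sketch below computes recall, false positive rate, and precision against a labelled test set. The function name `evaluate`, the id scheme, and the two example algorithms are assumptions made up for this example; they are not part of the original answer.

```python
def evaluate(predicted, actual):
    """Return (recall, false_positive_rate, precision).

    predicted: set of image ids the algorithm flagged as copies.
    actual: dict mapping every id in the test set to True (known match) or False.
    """
    true_pos = sum(1 for i in predicted if actual[i])
    false_pos = len(predicted) - true_pos
    positives = sum(1 for is_match in actual.values() if is_match)
    negatives = len(actual) - positives
    recall = true_pos / positives if positives else 0.0
    fpr = false_pos / negatives if negatives else 0.0
    precision = true_pos / len(predicted) if predicted else 0.0
    return recall, fpr, precision


# Hypothetical test set: 1,000 labelled images, 5% known matches (ids 0..49).
actual = {i: i < 50 for i in range(1000)}
# Hypothetical algorithm A flags 10% of the images: every match plus 50 false alarms.
flags_a = set(range(100))
# Hypothetical algorithm B flags 4% of the images, all of them true matches.
flags_b = set(range(40))

print(evaluate(flags_a, actual))
print(evaluate(flags_b, actual))
```

Here A finds every match but only half of what it reports is correct, while B misses a fifth of the matches but never raises a false alarm. Which of the two is "better" depends on what a false accusation costs compared with a missed copy.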

The test data should be designed to cover as many as possible of the variations that you would expect to find in the real world.

It is important to note that, to be useful, each algorithm must perform better than random guessing; otherwise it is useless to us!

You can then apply your software to the real world in a controlled way and start to analyse the results it produces. This is the sort of software project that can go on ad infinitum: there are always tweaks and improvements you can make. It is important to bear that in mind when designing it, as it is easy to fall into the trap of the never-ending project.

Colour buckets

With two pictures, scan each pixel and count the colours. For example, you might have the 'buckets':

white
red
blue
green
black

(Obviously you would have a higher resolution of counters.) Every time you find a 'red' pixel, you increment the red counter. Each bucket can be representative of a spectrum of colours; the higher the resolution, the more accurate the result, but you should experiment to find an acceptable difference rate.

Once you have your totals, compare them to the totals for the second image. You might find that each image has a fairly unique footprint, enough to identify matches.
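A minimal sketch of the bucket idea, assuming pixels are already available as `(r, g, b)` tuples with 0-255 channels. The function names, the `levels` parameter, and the difference measure are illustrative choices, not something the answer prescribes. Counts are stored as fractions of the image so that a resized copy still produces a comparable footprint.

```python
from collections import Counter


def bucket_histogram(pixels, levels=4):
    """Count pixels per colour bucket, as fractions of the whole image.

    pixels: iterable of (r, g, b) tuples with 0-255 channels.
    levels: buckets per channel (should divide 256), giving levels**3 buckets.
    """
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bucket: n / total for bucket, n in counts.items()}


def histogram_difference(h1, h2):
    """0.0 for identical colour footprints, 1.0 for completely disjoint ones."""
    buckets = set(h1) | set(h2)
    return 0.5 * sum(abs(h1.get(b, 0.0) - h2.get(b, 0.0)) for b in buckets)


# Tiny hypothetical 'images' given as flat pixel lists.
red = [(255, 0, 0)] * 4
darker_red = [(230, 10, 5)] * 16   # different size, slightly different shade
blue = [(0, 0, 255)] * 4

print(histogram_difference(bucket_histogram(red), bucket_histogram(darker_red)))
print(histogram_difference(bucket_histogram(red), bucket_histogram(blue)))
```

A footprint like this ignores where pixels are, so rotation does not affect it, but a negative of the image would land in entirely different buckets and would need to be checked separately.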

Edge detection

How about using edge detection?

With two similar pictures, edge detection should provide you with a usable and fairly reliable unique footprint.

Take both pictures and apply edge detection. Maybe measure the average thickness of the edges, then calculate the probability that the image has been scaled, and rescale if necessary. Below is an example of a Gabor filter (a type of edge detection) applied in various rotations.

Compare the pictures pixel for pixel, and count the matches and the non-matches. If they are within a certain threshold of error, you have a match. Otherwise, you could try reducing the resolution up to a certain point and see if the probability of a match improves.
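The pixel-for-pixel comparison of edge maps might be sketched as follows. To keep the example dependency-free it uses a crude neighbour-difference threshold as a stand-in for a real edge detector (it is not a Gabor filter), and it assumes both images are same-sized greyscale grids given as lists of rows. All names and the threshold value are assumptions for illustration.

```python
def edge_map(gray, threshold=50):
    """Mark pixels whose brightness differs sharply from the right/lower neighbour.

    gray: list of rows of 0-255 ints.
    Returns a grid of booleans one row and one column smaller than the input.
    """
    width, height = len(gray[0]), len(gray)
    return [
        [
            abs(gray[y][x + 1] - gray[y][x]) + abs(gray[y + 1][x] - gray[y][x])
            > threshold
            for x in range(width - 1)
        ]
        for y in range(height - 1)
    ]


def match_fraction(edges_a, edges_b):
    """Fraction of positions at which two same-sized edge maps agree."""
    pairs = [
        (a, b)
        for row_a, row_b in zip(edges_a, edges_b)
        for a, b in zip(row_a, row_b)
    ]
    return sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical 4x4 image with a vertical edge, plus a brightened copy and a negative.
image = [[0, 0, 200, 200] for _ in range(4)]
brightened = [[v + 30 for v in row] for row in image]
negative = [[255 - v for v in row] for row in image]

print(match_fraction(edge_map(image), edge_map(brightened)))
print(match_fraction(edge_map(image), edge_map(negative)))
```

Because the edge map depends only on differences between neighbouring pixels, a uniformly brightened copy and a negative both produce the same edges as the original in this sketch. Rotated or rescaled copies would still need to be aligned first.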

Regions of interest

Some images may have distinctive segments/regions of interest. These regions probably contrast highly with the rest of the image, and are a good thing to search for in your other images to find matches. Take this image for example:

The construction worker in blue is a region of interest and can be used as a search object. There are probably several ways you could extract properties/data from this region of interest and use them to search your data set.

If you have more than 2 regions of interest, you can measure the distances between them. Take this simplified example:

We have 3 clear regions of interest. The distance between regions 1 and 2 may be 200 pixels, between 1 and 3 400 pixels, and between 2 and 3 200 pixels.

Search other images for similar regions of interest, normalise the distance values and see if you have potential matches. This technique could work well for rotated and scaled images. The more regions of interest you have, the more the probability of a match increases as each distance measurement matches.
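One way to picture the normalised distances is sketched below. It assumes the regions of interest have already been located and reduced to centre coordinates (finding them is the hard part and is not shown). The function names and the tolerance are hypothetical.

```python
from itertools import combinations
from math import dist


def distance_signature(centres):
    """Sorted pairwise distances between region centres, scaled so the largest is 1.0.

    centres: list of at least two distinct (x, y) points.
    """
    distances = sorted(dist(a, b) for a, b in combinations(centres, 2))
    longest = distances[-1]
    return [d / longest for d in distances]


def signatures_match(sig_a, sig_b, tolerance=0.05):
    """True if both signatures have the same length and every value is close."""
    return len(sig_a) == len(sig_b) and all(
        abs(a - b) <= tolerance for a, b in zip(sig_a, sig_b)
    )


# Distances of 200, 400 and 200 pixels, as in the example above.
original = [(0, 0), (200, 0), (400, 0)]
# Hypothetical copy: scaled to half size, rotated by 90 degrees and shifted.
copy = [(10, 10), (10, 110), (10, 210)]

print(distance_signature(original))
print(signatures_match(distance_signature(original), distance_signature(copy)))
```

Dividing by the longest distance removes the effect of scaling, and distances between points do not change under rotation or translation, which is why this kind of signature can survive those edits.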

It is important to think about the context of your data set. If, for example, your data set is modern art, then regions of interest would work quite well, as the regions of interest were probably designed to be a fundamental part of the final image. If, however, you are dealing with images of construction sites, regions of interest may be seen by the illegal copier as ugly and may be cropped or edited out liberally. Keep in mind the common features of your data set, and attempt to exploit that knowledge.

Morphing

Morphing two images is the process of turning one image into the other through a set of steps:

[image]

Note that this is different from fading one image into another!

There are many software packages that can morph images. It's traditionally used as a transitional effect: two images don't usually morph into something halfway; one extreme morphs into the other extreme as the final result.

Why could this be useful? Depending on the morphing algorithm you use, there may be a relationship between the similarity of the images and some parameters of the morphing algorithm.

In a grossly oversimplified example, one algorithm might execute faster when there are fewer changes to be made. We then know there is a higher probability that these two images share properties with each other.

This technique could work well for rotated, distorted, skewed, zoomed, and all other types of copied images. Again, this is just an idea I have had; it's not based on any academic research as far as I am aware (I haven't looked hard, though), so it may be a lot of work for you with limited or no results.

Zipping

Ow's answer to this question is excellent; I remember reading about these sorts of techniques while studying AI. It is quite effective at comparing corpus lexicons.

One interesting optimisation when comparing corpora is that you can remove words considered to be too common, for example 'The', 'A', 'And', etc. These words dilute our result: we want to work out how different the two corpora are, so these can be removed before processing. Perhaps there are similar common signals in images that could be stripped before compression? It might be worth looking into.

Compression ratio is a very quick and reasonably effective way of determining how similar two sets of data are. Reading up on how compression works will give you a good idea of why this could be so effective. For a fast-to-release algorithm, this would probably be a good starting point.
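One common way to turn compression into a similarity score is the normalised compression distance, sketched here with Python's standard `zlib` module. The answer only says "compression ratio", so treating it as this particular measure is an assumption, as is feeding it raw, uncompressed pixel bytes (comparing already-compressed JPEG or PNG files this way is unlikely to work well).

```python
import random
import zlib


def compressed_size(data):
    """Length in bytes of data after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


def compression_distance(a, b):
    """Normalised compression distance: near 0 for similar data, near 1 for unrelated."""
    size_a, size_b = compressed_size(a), compressed_size(b)
    size_both = compressed_size(a + b)
    return (size_both - min(size_a, size_b)) / max(size_a, size_b)


# Hypothetical raw pixel buffers: noisy data, a lightly edited copy, and unrelated data.
rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(2000))
edited = bytearray(original)
edited[100] ^= 0xFF          # flip two bytes to simulate trivial added detail
edited[1500] ^= 0xFF
edited = bytes(edited)
unrelated = bytes(rng.randrange(256) for _ in range(2000))

print(compression_distance(original, edited))
print(compression_distance(original, unrelated))
```

The idea is that if the second buffer is mostly a repeat of the first, the compressor can encode it as back-references, so compressing both together costs little more than compressing one. Note that zlib only looks back 32 KB, so large images would need a compressor with a bigger window or some downscaling first.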

Transparency

Again, I am unsure how transparency data is stored for certain image types (GIF, PNG, etc.), but it will be extractable and would serve as an effective simplified cut-out to compare with your data set's transparency.

Inverting signals

An image is just a signal. If you play a noise from a speaker, and you play the opposite noise from another speaker in perfect sync at the exact same volume, they cancel each other out.

Invert one of the images, and add it onto your other image. Scale it and loop through positions repeatedly until you find a resulting image where enough of the pixels are white (or black? I'll refer to it as a neutral canvas) to provide you with a positive match, or partial match.

However, consider two images that are equal, except one of them has a brighten effect applied to it:

Inverting one of them, then adding it to the other, will not result in the neutral canvas we are aiming for. However, when comparing the pixels from both original images, we can definitely see a clear relationship between the two.

I haven't studied colour for some years now, and am unsure whether the colour spectrum is on a linear scale, but if you determine the average factor of colour difference between both pictures, you can use this value to normalise the data before processing with this technique.
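A minimal sketch of the cancel-out idea with brightness normalisation, assuming two already-aligned greyscale images given as flat lists of 0-255 values. It models the "brighten" effect as a constant offset, which is an assumption (the answer speaks of an average "factor", and a real filter may scale or clip values instead). The function names are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def cancel_residue(gray_a, gray_b, normalise=True):
    """Add the inverse of gray_b to gray_a and measure what fails to cancel.

    gray_a, gray_b: equal-length flat lists of 0-255 brightness values.
    Returns the mean absolute distance from the neutral canvas (255 everywhere);
    0.0 means the two images cancel out exactly.
    """
    # Assumed model: one image is the other plus a constant brightness offset.
    offset = mean(gray_a) - mean(gray_b) if normalise else 0.0
    residues = []
    for a, b in zip(gray_a, gray_b):
        inverted = 255 - b               # invert the second image
        summed = a + inverted            # add it onto the first
        residues.append(abs(summed - 255 - offset))
    return mean(residues)


# Hypothetical image and a copy with a brighten effect applied.
image = [10, 50, 90, 130]
brightened = [v + 40 for v in image]

print(cancel_residue(image, brightened, normalise=False))
print(cancel_residue(image, brightened))
```

Without normalisation every pixel is left 40 levels away from neutral; after subtracting the average brightness difference the two images cancel completely in this idealised case.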

Tree data structures

At first these don't seem to fit the problem, but I think they could work.

You could think about extracting certain properties of an image (for example colour bins) and generating a Huffman tree or similar data structure. You might be able to compare two trees for similarity. This wouldn't work well for photographic data, for example, with its large spectrum of colour, but it might work for cartoons or other images with a reduced colour set.

This probably wouldn't work, but it's an idea. The trie data structure is great at storing lexicons, for example a dictionary. It's a prefix tree. Perhaps it's possible to build an image equivalent of a lexicon (again, I can only think of colours) to construct a trie. If you reduced, say, a 300x300 image into 5x5 squares, then decomposed each 5x5 square into a sequence of colours, you could construct a trie from the resulting data. If a 2x2 square contains:

FFFFFF|000000|FDFD44|FFFFFF

We have a fairly unique trie code that extends 24 levels; increasing/decreasing the levels (i.e. reducing/increasing the size of our sub-square) may yield more accurate results.

Comparing tries should be reasonably easy, and could possibly provide effective results.
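To make the trie idea a little more concrete, here is a small sketch using nested dictionaries, with each hex digit of a block's colour sequence as one level. The similarity measure (shared prefixes divided by all prefixes) and every function name are assumptions chosen for illustration; the answer does not say how two tries should be compared.

```python
def build_trie(blocks):
    """Build a nested-dict trie from blocks of 6-digit hex colour strings."""
    trie = {}
    for block in blocks:
        node = trie
        # Four colours give 24 hex digits, hence a path 24 levels deep.
        for digit in "".join(block):
            node = node.setdefault(digit, {})
    return trie


def count_nodes(trie):
    """Total number of nodes (distinct prefixes) stored in the trie."""
    return sum(1 + count_nodes(child) for child in trie.values())


def shared_nodes(a, b):
    """Number of prefixes that are present in both tries."""
    return sum(1 + shared_nodes(a[key], b[key]) for key in a if key in b)


def trie_similarity(a, b):
    """Shared prefixes divided by all prefixes, from 0.0 to 1.0."""
    common = shared_nodes(a, b)
    union = count_nodes(a) + count_nodes(b) - common
    return common / union if union else 1.0


# The 2x2 square from the text, and a hypothetical variant with one colour changed.
square = ["FFFFFF", "000000", "FDFD44", "FFFFFF"]
variant = ["FFFFFF", "000000", "000000", "FFFFFF"]

trie_a = build_trie([square])
trie_b = build_trie([variant])
print(count_nodes(trie_a))
print(trie_similarity(trie_a, trie_b))
```

One weakness this makes visible: a prefix tree only credits agreement up to the first differing digit, so a change early in a block's colour sequence hides all the agreement that follows it.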

More ideas

I stumbled across an interesting paper brief about the classification of satellite imagery. It outlines:

Texture measures considered are: cooccurrence matrices, gray-level differences, texture-tone analysis, features derived from the Fourier spectrum, and Gabor filters. Some Fourier features and some Gabor filters were found to be good choices, in particular when a single frequency band was used for classification.

It may be worth investigating those measurements in more detail, although some of them may not be relevant to your data set.

Other things to consider

There are probably a lot of papers on this sort of thing, so reading some of them should help, although they can be very technical. It is an extremely difficult area of computing, with many fruitless hours of work spent by many people attempting to do similar things. Keeping it simple and building upon those ideas would be the best way to go. It should be a reasonably difficult challenge to create an algorithm with a better-than-random match rate, and improving on that really does start to get quite hard.

Each method would probably need to be tested and tweaked thoroughly. If you have any information about the type of picture you will be checking, that would be useful as well. Take advertisements, for example: many of them have text in them, so text recognition would be an easy and probably very reliable way of finding matches, especially when combined with other solutions. As mentioned earlier, attempt to exploit common properties of your data set.

Combining alternative measurements and techniques, each of which has a weighted vote (depending on its effectiveness), would be one way to create a system that generates more accurate results.
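A weighted vote could be as simple as the weighted average below. The technique names, scores, and weights are made up for the example; in practice the weights would come from how well each technique performed on your labelled test data.

```python
def weighted_vote(scores, weights):
    """Combine per-technique match scores (0.0 to 1.0) into one weighted average.

    scores: dict of technique name -> match score for one pair of images.
    weights: dict of technique name -> trust placed in that technique.
    """
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical scores for one image pair and hypothetical weights.
scores = {"colour_buckets": 0.9, "edges": 0.8, "compression": 0.3}
weights = {"colour_buckets": 1.0, "edges": 2.0, "compression": 1.0}

combined = weighted_vote(scores, weights)
print(combined)
print("possible copy" if combined >= 0.6 else "probably unrelated")
```

The 0.6 cut-off is arbitrary; it is the kind of threshold you would tune against the known matches in your test set.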

If employing multiple algorithms then, as mentioned at the beginning of this answer, one may find all the positives but have a false positive rate of 20%. It would be of interest to study the properties, strengths and weaknesses of the other algorithms, as another algorithm may be effective in eliminating the false positives returned by the first.

Be careful not to fall into attempting to complete the never-ending project. Good luck!
