Algorithm to compare two images in C#
Question
I'm writing a tool in C# to find duplicate images. Currently I create an MD5 checksum of each file and compare those.
Unfortunately, my images can
- be rotated by 90 degrees
- have different dimensions (a smaller image with the same content)
- have different compression or file types (e.g. JPEG artifacts, see below)
What would be the best approach to solve this problem?
Solution
Here is a simple approach using a 256-bit image hash (for comparison, MD5 has 128 bits):
- resize the picture to 16x16 pixels
- reduce the colors to black/white (which equals true/false in this console output)
- read the boolean values into a List<bool>; this is the hash
Code:
using System.Collections.Generic;
using System.Drawing;

public static List<bool> GetHash(Bitmap bmpSource)
{
    List<bool> lResult = new List<bool>();
    // Create a new 16x16 pixel image; dispose it once the hash is built
    using (Bitmap bmpMin = new Bitmap(bmpSource, new Size(16, 16)))
    {
        // Walk the pixels row by row (j = row, i = column)
        for (int j = 0; j < bmpMin.Height; j++)
        {
            for (int i = 0; i < bmpMin.Width; i++)
            {
                // Reduce colors to true / false: true means the pixel is dark
                lResult.Add(bmpMin.GetPixel(i, j).GetBrightness() < 0.5f);
            }
        }
    }
    return lResult;
}
I know GetPixel is not that fast, but on a 16x16 pixel image it should not be the bottleneck.
- compare this hash to the hash values of other images and allow a tolerance (the number of pixels that may differ from the other hash)
Code:
// Requires: using System.Linq; (for Zip and Count)
List<bool> iHash1 = GetHash(new Bitmap(@"C:\mykoala1.jpg"));
List<bool> iHash2 = GetHash(new Bitmap(@"C:\mykoala2.jpg"));

// Determine the number of equal pixels (x of 256)
int equalElements = iHash1.Zip(iHash2, (i, j) => i == j).Count(eq => eq);
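To turn the count of equal pixels into a yes/no decision, something like the following might work. This is only a sketch: the class name HashComparison, the method names, and the default minSimilarity of 0.95 are assumptions made for illustration and are not part of the original answer, so tune the threshold on your own images.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HashComparison
{
    // Fraction of positions (0.0 to 1.0) at which the two hashes agree.
    public static double Similarity(IReadOnlyList<bool> hash1, IReadOnlyList<bool> hash2)
    {
        if (hash1.Count == 0 || hash1.Count != hash2.Count)
            throw new ArgumentException("Hashes must be non-empty and of equal length.");

        int equal = hash1.Zip(hash2, (a, b) => a == b).Count(eq => eq);
        return (double)equal / hash1.Count;
    }

    // minSimilarity is an assumed tolerance, not a value from the answer.
    public static bool IsProbablySameImage(
        IReadOnlyList<bool> hash1, IReadOnlyList<bool> hash2, double minSimilarity = 0.95)
    {
        return Similarity(hash1, hash2) >= minSimilarity;
    }
}
```

With the hashes from above, HashComparison.IsProbablySameImage(iHash1, iHash2) would then report whether the two koala pictures are close enough to count as duplicates.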
So this approach is able to find equal images with:
- different file formats (e.g. jpg, png, bmp)
- rotation (90, 180, 270 degrees) or a horizontal/vertical flip, by changing the iteration order of i and j
- different dimensions (the same aspect ratio is required)
- different compression (a tolerance is required in case of quality loss such as JPEG artifacts): you can accept 99% equality as the same image and 50% as a different one
- color changed to grayscale and the other way round (because brightness is independent of color)
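For the rotation and flip case, one option is to leave GetHash alone and remap the indices of an already computed hash, which has the same effect as changing the iteration order of i and j. The sketch below assumes the hash is stored row by row (index = row * size + column), as GetHash produces it; the class HashOrientation and its method names are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HashOrientation
{
    // Rotates a row-major size x size hash by 90 degrees clockwise.
    public static bool[] Rotate90(IReadOnlyList<bool> hash, int size = 16)
    {
        if (hash.Count != size * size)
            throw new ArgumentException("Hash length must be size * size.");

        var result = new bool[size * size];
        for (int y = 0; y < size; y++)
            for (int x = 0; x < size; x++)
                // The pixel landing at (x, y) came from column y, row size-1-x.
                result[y * size + x] = hash[(size - 1 - x) * size + y];
        return result;
    }

    // Mirrors a row-major size x size hash left to right.
    public static bool[] FlipHorizontal(IReadOnlyList<bool> hash, int size = 16)
    {
        if (hash.Count != size * size)
            throw new ArgumentException("Hash length must be size * size.");

        var result = new bool[size * size];
        for (int y = 0; y < size; y++)
            for (int x = 0; x < size; x++)
                result[y * size + x] = hash[y * size + (size - 1 - x)];
        return result;
    }

    // Highest number of equal positions over all 8 orientations
    // (4 rotations, each with and without a horizontal flip).
    public static int BestEqualCount(
        IReadOnlyList<bool> reference, IReadOnlyList<bool> candidate, int size = 16)
    {
        int best = 0;
        IReadOnlyList<bool> current = candidate;
        for (int r = 0; r < 4; r++)
        {
            best = Math.Max(best, CountEqual(reference, current));
            best = Math.Max(best, CountEqual(reference, FlipHorizontal(current, size)));
            current = Rotate90(current, size);
        }
        return best;
    }

    private static int CountEqual(IReadOnlyList<bool> a, IReadOnlyList<bool> b)
    {
        return a.Zip(b, (p, q) => p == q).Count(eq => eq);
    }
}
```

A vertical flip does not need its own method here, since it is the same as a horizontal flip combined with a 180 degree rotation, and that combination is already among the 8 orientations.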
Update / Improvements:
After using this method for a while, I noticed a few improvements that can be made:
- replace GetPixel for better performance
- use the EXIF thumbnail instead of reading the whole image, for a performance improvement
- instead of using a fixed 0.5f to distinguish between light and dark, use the median brightness of all 256 pixels. Otherwise, all dark (or all light) images are assumed to be the same; using the median also makes it possible to detect images whose brightness has been changed.
- if you need fast calculations, use bool[] or List<bool>; if you need to store a lot of hashes and want to save memory, use a BitArray, because a Boolean isn't stored in a bit, it takes a byte!
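The median threshold and the BitArray storage could be combined roughly like this. It is a sketch under assumptions: the class MedianHash and its methods are hypothetical names, and the input is taken to be one brightness value per pixel of the 16x16 image (for example collected from Color.GetBrightness()), which keeps the example independent of System.Drawing.

```csharp
using System;
using System.Collections;

public static class MedianHash
{
    // Builds a hash in which a bit is true when the pixel is darker
    // than the median brightness of the whole image.
    public static BitArray FromBrightness(float[] brightness)
    {
        if (brightness.Length == 0)
            throw new ArgumentException("At least one brightness value is required.");

        // Sort a copy so the caller's pixel order is left untouched.
        float[] sorted = (float[])brightness.Clone();
        Array.Sort(sorted);
        int mid = sorted.Length / 2;
        float median = sorted.Length % 2 == 0
            ? (sorted[mid - 1] + sorted[mid]) / 2f
            : sorted[mid];

        var bits = new BitArray(brightness.Length);
        for (int i = 0; i < brightness.Length; i++)
            bits[i] = brightness[i] < median;
        return bits;
    }

    // Number of positions at which both hashes hold the same bit.
    public static int CountEqual(BitArray a, BitArray b)
    {
        // Xor changes the instance it is called on, so work on a copy of a.
        BitArray diff = new BitArray(a).Xor(b);
        int differing = 0;
        foreach (bool bit in diff)
            if (bit) differing++;
        return a.Length - differing;
    }
}
```

Because the threshold moves with the image, a uniformly brightened or darkened copy should end up with nearly the same bits as the original, and each hash takes about 32 bytes of bit storage instead of 256 bytes as a bool[].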