Similar images - how to compare them

Posted: 2018/7/25 16:22:17 | Tags: php, image, image-processing, similarity, fingerprint

Problem description

I have over 1.3 million images that I have to compare with each other, and a few hundred more are added every day.

My company takes an image and creates a version that can be used by our vendors.

The files are often very similar to each other. For example, two different companies can send us two different images, a JPG and a GIF, both showing the McDonald's logo, with months between the submissions.

As a result, we end up creating the same logo twice when we could simply copy/paste the one we already created, or at least suggest it to the artists as a possible starting point.

I have looked around for algorithms that create a fingerprint, or anything else that would let me run a simple query when a new image is uploaded. Time is not much of an issue: at 1 second per fingerprint, fingerprinting 1.3 million images would take about 15 days on one machine, and the savings would be large enough that we might even get 3 or 4 servers to do it.

I am fluent in PHP, but if the algorithm is in pseudocode or even C, I can read it and try to translate it (unless it uses some C-specific libraries).

Currently I compute an MD5 of every image to catch the ones that are exactly the same. This question came up when I was thinking of resizing each image and running MD5 on the resized version, to catch images that have been saved in a different format or at a different size. Even then, I would still not have good enough recognition.
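The exact-match step described above can be sketched as follows. This is an illustration rather than the poster's actual code: it is written in Python with only the standard library, and the function names and sample file names are hypothetical. It also shows why MD5 alone cannot catch near-duplicates: a single changed byte gives an unrelated digest.

```python
import hashlib
from typing import Dict, List


def md5_hex(data: bytes) -> str:
    """Return the MD5 hex digest of the raw file bytes."""
    return hashlib.md5(data).hexdigest()


def group_exact_duplicates(files: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Map each digest to the names of all files that share it."""
    groups: Dict[str, List[str]] = {}
    for name, data in files.items():
        groups.setdefault(md5_hex(data), []).append(name)
    # Keep only digests that were seen more than once.
    return {digest: names for digest, names in groups.items() if len(names) > 1}


# Hypothetical sample data: two byte-identical files and one different file.
sample = {"a.jpg": b"same bytes", "b.jpg": b"same bytes", "c.gif": b"other bytes"}
duplicates = group_exact_duplicates(sample)
print(list(duplicates.values()))
```

Because the digest depends on every byte, re-encoding or resizing an image changes it completely, which is why a fingerprint that tolerates small differences is needed for the "similar" case.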

In case I didn't mention it, I would be happy with something that just suggests possible "similar" images.

Edit

Keep in mind that the check needs to run multiple times per minute, so the best solution is one that gives me a few values per image that I can store and later compare against the image I am looking at, without having to re-scan the whole server.
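One way to picture "stored values that can be compared later" is to keep one integer fingerprint per image and compare fingerprints by counting differing bits (the Hamming distance). The sketch below assumes such integer fingerprints already exist; the image ids, the bit patterns, and the `max_distance` threshold are hypothetical, and the code is Python with only the standard library.

```python
from typing import Dict, List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions at which two integer fingerprints differ."""
    return bin(a ^ b).count("1")


def find_similar(
    query: int, stored: Dict[str, int], max_distance: int = 5
) -> List[Tuple[str, int]]:
    """Return (image_id, distance) pairs within max_distance, closest first."""
    matches = [
        (image_id, hamming_distance(query, fingerprint))
        for image_id, fingerprint in stored.items()
    ]
    matches = [m for m in matches if m[1] <= max_distance]
    return sorted(matches, key=lambda m: m[1])


# Hypothetical stored fingerprints (8 bits here for readability).
stored = {"logo_a": 0b11110000, "logo_b": 0b11110001, "photo": 0b00001111}
print(find_similar(0b11110000, stored, max_distance=1))
```

A linear scan like this is the simplest option and touches only the stored fingerprints, not the image files. Whether it is fast enough for 1.3 million entries several times per minute would need to be measured.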

I am reading some pages that mention histograms, or resizing the image to a very small size, stripping any tags, converting it to grayscale, hashing the resulting file, and using that hash for comparison. If I am successful I will post the code/answer here.
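The "shrink, convert to grayscale, then hash" idea can be sketched with a technique commonly called an average hash. Note that this swaps the cryptographic hash of the small file for a bit-per-pixel comparison against the mean brightness, so that small changes flip only a few bits. The sketch assumes the image has already been decoded into a grid of grayscale values (0 to 255) at least 8 pixels on each side; decoding would need an image library and is not shown. All function names are hypothetical, and the code is Python with only the standard library.

```python
from typing import List


def shrink(pixels: List[List[int]], size: int = 8) -> List[List[float]]:
    """Downscale a grayscale grid to size x size by averaging pixel blocks."""
    height, width = len(pixels), len(pixels[0])
    small = []
    for row in range(size):
        y0, y1 = row * height // size, (row + 1) * height // size
        out_row = []
        for col in range(size):
            x0, x1 = col * width // size, (col + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            out_row.append(sum(block) / len(block))
        small.append(out_row)
    return small


def average_hash(pixels: List[List[int]], size: int = 8) -> int:
    """Build a size*size-bit fingerprint: 1 where a cell is above the mean."""
    flat = [value for row in shrink(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def half_and_half(n: int, dark: int = 0, bright: int = 255) -> List[List[int]]:
    """Test image: left half dark, right half bright, n x n pixels."""
    return [[dark if x < n // 2 else bright for x in range(n)] for _ in range(n)]


# The same pattern at two sizes and two contrast levels gives one fingerprint.
print(average_hash(half_and_half(16)) == average_hash(half_and_half(32)))
print(average_hash(half_and_half(16)) == average_hash(half_and_half(16, 10, 245)))
```

Since the fingerprint is a plain integer, it can be stored per image and compared by counting differing bits, which fits the requirement of not re-scanning the files.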
