Search for similar images with different filenames in two directories


Problem description


I have 2 directories with lots and lots of images, say: color/ and gray/. In color/ images are named: image1.png image2.png, etc.

I know that gray/ contains the same images, but in grayscale, and the file names and the order of the files are different (e.g. file_01.png exists, but it IS NOT the same image as image1.png).

Is it possible to make a comparison of images in both directories and copy color/ files to a results/ directory with gray/ file names?

Example:

directory        | directory           | directory
   "color/"      |     "gray/"         |      "results/" 
(color images)   | (grayscale images)  | (color images with gray-scale names)   
-----------------+---------------------+----------------------------------------
color/image1.png | gray/file324.png    | results/file324.png  (in color: ==>
                                       | this and image1.png are the same image)

I hope this is not very confusing, but I don't know how to explain it better.

I have tried with imagemagick, and it seems that the -compare option could work for this, but I'm unable to make a bash script or something that does it well.

Another way to say it: I want all color/*.jpg copied into the results/*.jpg folder using the correctly matching gray/*.jpg names.

EDIT (some notes):

  1. The three images are IDENTICAL in size and content. The only difference is that two are in color and one is in gray-scale. And the name of the files, of course.
  2. I uploaded a zip file with one sample image with their current names (folder "img1" is the color folder and folder "img2" is the grayscale folder) and the expected result ("img3" is the results folder), here: http://www.mediafire.com/?9ug944v6h7t3ya8

Solution

If I understood the requirement correctly, we need to:

  • find for each grayscale image named XYZ that is in folder gray/...
  • ...the matching color image named ABC that is in folder color/ and...
  • ...copy ABC to folder results/ under the new name XYZ

So the basic algorithm I suggest is this:

  1. Convert all images in folder color/ to grayscale and store result in folder gray-reference/. Keep the original names:

    mkdir gray-reference
    convert  color/img123.jpg  -colorspace gray  gray-reference/img123.jpg
    

  2. For each grayscale image in gray-reference/, make a comparison with each grayscale image in folder gray/. If you find a match, copy the color image that has the same name as the matching reference image from color/ to results/, giving it the name of the gray/ image. One possible comparison command, which also creates a visual representation of the differences, is this:

    compare  gray-reference/img123.jpg  gray/imgABC.jpg  -compose src delta.jpg
    

The real trick is the comparison (as in step 2) of the two grayscale images. ImageMagick has a handy command to compare two (similar) images pixel by pixel and write the results into a 'delta' image:

compare  reference.png  test.png  -compose src  delta.png

If the comparison is for color images, in the delta image...

  • ...each pixel that was equal appears in white, while...
  • ...each pixel that was different appears in a highlight color (defaults to red).

See also my answer "ImageMagick: 'Diff' an Image" for an illustrated example of this technique.

If we directly compared a gray image with a color image pixel by pixel we would of course find that almost every single pixel is different (resulting in an all-red "delta" picture). Hence my proposal from step 1 above to first convert the color image to grayscale.

If we compare two grayscale images, the resulting delta image is in grayscale too. Hence the default highlight color can't be red. We had better set it to 'black' so that it is easier to see.

Now, our grayscale conversion of the color image could produce a slightly 'different' sort of grayscale than the one the existing gray images have: our freshly produced grays could be slightly lighter or darker than the existing grayscale image because different color profiles were applied. In that case the delta picture could still come out all-"red", or rather all-highlight-color. However, I tested this with your sample images, and the results are good:

 convert  color/image1.jpg  -colorspace gray  image1-gray.jpg  
 compare                  \
    gray/file324.jpg      \
    image1-gray.jpg       \
   -highlight-color black \
   -compose src           \
    delta.jpg

delta.jpg consists of 98% white pixels. I'm not sure whether all of your other thousands of grayscale images were derived from the color originals with the same settings. Therefore we add a small fuzz factor when running the compare command, which allows for some deviation in color when two pixels are compared:

compare  -fuzz 3%  reference.png  test.png  -compose src  delta.png

Since this algorithm is to be executed many thousands of times (maybe several millions of times, given the number of images you talk about), we should make some performance considerations and we should time the duration of the compare command. This is especially a concern, since your sample images are rather large (3072x2048 pixels -- 6 Mega-Pixels), and the comparison could take a while.

My timing results on a MacBook Pro were these:

time (convert  color/image1.jpg  -colorspace gray  image1-gray.jpg ;
      compare                   \
         gray/file324.jpg       \
         image1-gray.jpg        \
        -highlight-color black  \
        -fuzz 3%                \
        -compose src            \
         delta100-fuzz.jpg)

  real  0m6.085s
  user  0m2.616s
  sys   0m0.598s

6 seconds for: 1 conversion of a large color image to grayscale, plus 1 comparison of two large grayscale images.

You talked about 'thousands of images'. Assuming 3000 images, based on this timing, the processing of all the images would require (3000*3000)/2 comparisons (4.5 million) and (3000*3000*6)/2 seconds (27 million sec). That's a total of 312 days to complete all comparisons. Too long, if you ask me.
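The estimate above can be reproduced with plain shell arithmetic. This is only a sketch: the variable names are made up for illustration, the image count of 3000 is the assumption used above, and the 6 seconds per comparison is the rounded timing from the measurement.

```shell
#!/bin/sh
# Back-of-the-envelope estimate; all variable names are illustrative.
images=3000            # assumed number of images per directory
secs_per_compare=6     # rounded timing from the measurement above

comparisons=$(( images * images / 2 ))
total_secs=$(( comparisons * secs_per_compare ))
total_days=$(( total_secs / 86400 ))   # integer division, rounds down

echo "comparisons: ${comparisons}"
echo "seconds:     ${total_secs}"
echo "days:        ${total_days}"
```

Changing secs_per_compare to a measured value for smaller images shows quickly how much a faster comparison shortens the whole job.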

What could we do to improve the performance?

Well, my first idea is to reduce the size of the images. If we compare smaller images instead of 3072x2048 sized ones, the comparison should return the result faster. (However, we will also spend additional time on first scaling down our test images, but hopefully much less time than we save later when comparing the smaller images.) Here is the timing:

time (convert color/image1.jpg  -colorspace gray  -scale 6.25%  image1-gray.jpg  ;
      convert gray/file324.jpg                    -scale 6.25%  file324-gray.jpg ;
      compare                  \
         file324-gray.jpg      \
         image1-gray.jpg       \
        -highlight-color black \
        -fuzz 3%               \
        -compose src           \
         delta6.25-fuzz.jpg)

   real  0m0.670s
   user  0m0.584s
   sys   0m0.074s

That's much better! We shaved off almost 90% of processing time, which gives hope to complete the job in 35 days if you use a MacBook Pro.

The improvement is only logical: by reducing the image dimension to 6.25% of the original the resulting images are only 192x128 pixels -- a reduction from 6 million pixels to 24.5 thousand pixels, a ratio of 256:1.
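The numbers in this paragraph can be checked with a few lines of shell arithmetic. This is a small sketch with illustrative variable names; it relies on 6.25% being exactly 1/16.

```shell
#!/bin/sh
# Illustrative variable names; 6.25% is the same as dividing by 16.
width=3072
height=2048
divisor=16

small_w=$(( width / divisor ))
small_h=$(( height / divisor ))
small_pixels=$(( small_w * small_h ))
ratio=$(( (width * height) / small_pixels ))

echo "scaled size: ${small_w}x${small_h} (${small_pixels} pixels)"
echo "pixel ratio: ${ratio}:1"
```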

(NOTE: The -thumbnail and the -resize parameters would work a little bit faster than -scale does. However, this speed increase is a trade-off against quality loss. That quality loss would probably make the comparison much less reliable...)

Instead of creating a visually inspectable delta image from the compared images, we can tell ImageMagick to print out some statistics. To get the number of different pixels, we can use the AE metric. The command with its results is this:

time (convert color/image1.jpg -colorspace gray -scale 6.25% image1-gray.jpg  ;
     convert gray/file324.jpg                   -scale 6.25% file324-gray.jpg ;
     compare -metric AE  file324-gray.jpg image1-gray.jpg -fuzz 3% null: 2>&1 )
0 

  real  0m0.640s
  user  0m0.574s
  sys   0m0.073s

This means we have 0 differing pixels -- a result that we could directly use inside a shell script!
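One way to use that result in a script is to wrap the command in a small function that succeeds only when the reported count is 0. This is a sketch, not a tested tool: images_match is a hypothetical name, and it assumes that compare prints nothing but the bare number, as in the output shown above.

```shell
#!/bin/sh
# images_match is a hypothetical helper name, not part of ImageMagick.
# It succeeds (exit status 0) when `compare -metric AE` reports
# 0 differing pixels for the two image files passed as arguments.
images_match() {
    # compare prints the metric on stderr, hence the 2>&1 redirection
    differing=$(compare -metric AE "$1" "$2" -fuzz 3% null: 2>&1)
    [ "x${differing}" = "x0" ]
}

# Example use (file names taken from the timing test above):
#   if images_match file324-gray.jpg image1-gray.jpg; then echo "match"; fi
```

Any other output, including an error message from compare, is treated as "no match", so a missing or unreadable file never produces a false positive.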

Building blocks for a Shell script

So here are the building blocks for a shell script to do the automatic comparison:

  1. Convert color images from 'color/' directory to grayscale ones, scale them down to 6.25% and save results in 'reference-color/' directory:

    # Estimated time required to convert 1000 images of size 3072x2048:
    #   500 seconds
    mkdir reference-color
    for i in color/*.jpg; do
        convert  "${i}"  -colorspace gray  -scale 6.25%  "reference-color/$(basename "${i}")"
    done
    

  2. Scale down images from 'gray/' directory and save results in 'reference-gray/' directory:

    # Estimated time required to convert 1000 images of size 3072x2048:
    #    250 seconds
    mkdir reference-gray
    for i in gray/*.jpg; do
        convert  "${i}"  -scale 6.25%  "reference-gray/$(basename "${i}")"
    done
    

  3. Compare each image from directory 'reference-gray/' with images from directory 'reference-color' until a match is found:

    # Estimated time required to compare 1 image with 1000 images:
    #    300 seconds
    # If we have 1000 images, we need to conduct a total of 1000*1000/2
    # comparisons to find all matches;
    #    that is, we need about 2 days to accomplish all.
    # If we have 3000 images, we need a total of 3000*3000/2 comparisons
    # to find all matches;
    #    this requires about 20 days.
    #
    mkdir -p results
    for i in reference-gray/*.jpg ; do

        for j in reference-color/*.jpg ; do

            # compare the two grayscale reference images
            if [ "x0" = "x$(compare  -metric AE  "${i}"  "${j}" -fuzz 3%  null: 2>&1)" ]; then

                # if we found a match, then create the copy under the required name
                cp "color/$(basename "${j}")"  "results/$(basename "${i}")"

                # if we found a match, then remove both reference images (we do not want to compare against them again)
                rm -f "${i}" "${j}"

                # if we found a match, break from within this loop and start the next one
                break

            fi

        done

    done
    

Caveat: Do not blindly rely on these building blocks. They are untested. I do not have a directory of multiple suitable images available to test this, and I do not want to create one myself just for this exercise. Proceed with caution!
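Since the blocks are untested, it may be worth checking after a run which images in gray/ ended up without a counterpart in results/. The following is a minimal sketch under the directory layout used above; list_unmatched is a hypothetical helper name.

```shell
#!/bin/sh
# list_unmatched is a hypothetical helper: it prints the name of every
# .jpg file in the first directory that has no same-named file in the
# second directory.
list_unmatched() {
    for f in "$1"/*.jpg; do
        [ -e "$f" ] || continue      # skip the literal pattern if nothing matches
        name=$(basename "$f")
        [ -e "$2/$name" ] || echo "$name"
    done
}

# Example use: list_unmatched gray results
```

An empty result means every grayscale file name now exists in results/. Any names that are printed are candidates for a second pass, for example with a larger fuzz factor.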
