How to find all files that don't have a matching file with the same name but a different extension


Problem description

I have a folder with over 1 million files. The files come in pairs that differ only by their extension (e.g. a1.ext1, a1.ext2, a2.ext1, a2.ext2, ...).

I need to scan this folder and make sure it fulfills this pairing requirement. If I find a file without its match, I should delete it.

I've already done it in Python, but it was very slow with a 7-figure number of files.

Is there a way to do this using a shell command or script?

Recommended answer

Building on another answer, you could use a script like this. It is meant to be placed in the same directory as the files and run from there:

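#!/usr/bin/env bash
# Directory that receives files without a partner; change as needed.
THRASH=../THRASH
mkdir -p "$THRASH"
shopt -s nullglob  # an unmatched glob expands to nothing

# printf is a shell builtin, so a huge file list cannot overflow the
# argument limit the way `ls *.ext1` can. Names must not contain whitespace.
for name in $(printf '%s\n' *.{ext1,ext2} | sed 's/\.[^.]*$//' | sort -u); do
    # Fewer than two matches means one member of the pair is missing.
    if [ "$(ls "$name".{ext1,ext2} 2> /dev/null | wc -l)" -lt 2 ]; then
        mv "$name".{ext1,ext2} "$THRASH" 2> /dev/null
    fi
done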

You can configure where files without a partner are moved by changing the THRASH variable.

On a dual-core 3.0 GHz Pentium with 2 GB of RAM, one run took 63.7 seconds (10,000 pairs, with about 1,500 of each member of the pair missing from the folder).
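
The script above starts `ls` and `wc` once per base name, which is likely where a good part of the run time goes. Below is a minimal sketch of a single-pass alternative: strip the extension from every name, sort, and keep the base names that occur exactly once. The function name `list_unpaired` and its three arguments are hypothetical, introduced here for illustration and not part of the original answer. The sketch assumes file names contain no newlines, and it has not been timed against the figures quoted above.

```shell
#!/usr/bin/env bash
# Sketch only: print the files in DIR that lack an EXT1/EXT2 partner.
# list_unpaired and its arguments are hypothetical names.
# Assumes file names contain no newline characters.
list_unpaired() {
    local dir=$1 ext1=$2 ext2=$3
    (
        cd "$dir" || exit 1
        shopt -s nullglob                  # unmatched globs expand to nothing
        files=( *."$ext1" *."$ext2" )
        (( ${#files[@]} )) || exit 0       # nothing to check
        # Strip the last extension; a base name seen once has no partner.
        printf '%s\n' "${files[@]}" |
            sed 's/\.[^.]*$//' |
            LC_ALL=C sort |
            LC_ALL=C uniq -u |
            while IFS= read -r base; do
                # Only one of the two candidates exists; print that one.
                for f in "$base.$ext1" "$base.$ext2"; do
                    if [[ -e $f ]]; then
                        printf '%s\n' "$f"
                    fi
                done
            done
    )
}
```

To move the unpaired files instead of only listing them, the output could be fed to a loop such as `list_unpaired . ext1 ext2 | while IFS= read -r f; do mv -- "$f" ../THRASH/; done`. Listing first and moving second makes it easy to inspect the result before anything is touched.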
