Find directories but exclude a list, where directory names contain spaces

Question

I have a process that audits files from one day to the next on a large file system. I want to exclude some directories from consideration by using a list of directories to exclude. I can do that just fine, but I run into trouble if an excluded directory has a space in its name.

For simplicity's sake, I'm only listing four sub-directories, but in reality there are many more directories I want to search than exclude. A new directory may also be added at any time, and I want new directories to be included automatically, hence the exclude list rather than an include list.

base_dir/
├── sub_dir1
├── sub_dir2
├── sub dir3
└── sub_dir4

I have a shell script and an exclude list:

$ cat exclude.txt
sub_dir2
sub dir3

The shell script uses find and printf along with awk and sort to get a list of directories to audit.

$ find ./base_dir -maxdepth 1 -type d $(printf "! -iname %s " $(cat exclude.txt)) | awk -F/ '{print $NF}' | sort
sub_dir1
sub dir3
sub_dir4

As you can probably guess and see above, this works except that it's not ignoring sub dir3. I've tried a few combinations of double quotes inside the exclude list, and %q vs %s vs %a in the printf format, but can't seem to find the correct combination.
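
The cause can be seen by looking at what printf actually generates. The following is a minimal sketch, assuming a throwaway copy of exclude.txt in a temporary directory (the `tmp` and `generated` names are introduced only for illustration). The unquoted `$(cat ...)` is split on whitespace, so printf receives `sub` and `dir3` as two separate names:

```shell
#!/usr/bin/env bash
# Recreate exclude.txt in a temporary directory (illustration only).
tmp=$(mktemp -d)
printf '%s\n' 'sub_dir2' 'sub dir3' > "$tmp/exclude.txt"

# Unquoted $(cat ...) undergoes word splitting: printf gets three
# arguments (sub_dir2, sub, dir3) and repeats its format once for each.
generated=$(printf '! -iname %s ' $(cat "$tmp/exclude.txt"))

echo "$generated"

rm -rf "$tmp"
```

find is therefore told to exclude directories named `sub` and `dir3`, neither of which exists, while `sub dir3` is never excluded.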

My desired output is:

sub_dir1
sub_dir4

I realize I could do something like:

find ./base_dir -maxdepth 1 -type d \
    ! -iname "sub dir3" $(printf "! -iname %s " $(cat exclude.txt)) \
    | awk -F/ '{print $NF}' | sort

and get my expected output, but I want to use only the exclude.txt list.

EDIT: After reading some replies I tried using an array and thought that would work, but now I'm even more puzzled about why this option doesn't work. printf appears to produce a string that would work if I typed it directly on the command line, but running it as a one-liner still gives me errors.

$ cat exclude.txt
base_dir
sub_dir2
"sub dir3"

$ mapfile -t exclude < exclude.txt

$ printf "! -iname %s " "${exclude[@]}"
! -iname base_dir ! -iname sub_dir2 ! -iname "sub dir3"

$ find ./base_dir -maxdepth 1 -type d $(printf "! -iname %s " "${exclude[@]}")
find: paths must precede expression: dir3"

$ find ./base_dir -maxdepth 1 -type d ! -iname base_dir ! -iname sub_dir2 ! -iname "sub dir3"
./base_dir/sub_dir1
./base_dir/sub_dir4
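
The difference between the two commands comes from how the shell treats an unquoted command substitution: its output is split on whitespace, and any quote characters in it stay literal text instead of being parsed as quoting. A minimal sketch to make that visible, where `count_args` is a hypothetical helper introduced only for illustration:

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# Unquoted command substitution: the output is split on whitespace and the
# double quotes are ordinary characters, so '"sub' and 'dir3"' are two words.
split_count=$(count_args $(printf '! -iname %s ' 'sub_dir2' '"sub dir3"'))

# Typed directly, the shell parses the quotes and "sub dir3" is one word.
typed_count=$(count_args ! -iname sub_dir2 ! -iname "sub dir3")

echo "from command substitution: $split_count words"
echo "typed on the command line: $typed_count words"
```

The extra word is the stray `dir3"` that find complains about in the error message.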

Recommended answer

You could read the exclude file into a Bash array and then craft a find command like this:

mapfile -t exclude < exclude.txt

# -mindepth 1        skips the starting directory itself
# -regextype egrep   lets the alternation "|" be written unescaped
# -printf '%f\n'     prints just the file name, without leading directories
find ./base_dir \
    -mindepth 1 \
    -type d \
    -regextype egrep \
    ! -iregex ".*/($(IFS='|'; echo "${exclude[*]}"))" \
    -printf '%f\n'

This results in:

sub_dir1
sub_dir4

For your example input, the alternation inside the -iregex test expands like this:

$ IFS='|'
$ echo "${exclude[*]}"
sub_dir2|sub dir3

so the regular expression for paths to exclude becomes

.*/(sub_dir2|sub dir3)

The change to IFS is limited to the command substitution, because the command substitution runs in a subshell.
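
A quick way to check this, as a sketch: an assignment to IFS made inside the command substitution does not leak into the calling shell. The array holds the example entries from exclude.txt; the `ifs_before` and `joined` names are introduced only for illustration.

```shell
#!/usr/bin/env bash
exclude=('sub_dir2' 'sub dir3')

# Remember the calling shell's IFS for comparison.
ifs_before=$IFS

# "${exclude[*]}" joins the elements with the first character of IFS,
# which is "|" only inside this command substitution.
joined=$(IFS='|'; echo "${exclude[*]}")

echo "$joined"

# IFS in the calling shell still has its previous value.
if [[ $IFS == "$ifs_before" ]]; then
    echo "IFS unchanged"
fi
```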

The limitation of this approach is that if the directories to be excluded contain characters that are special in regular expressions, you have to escape them, which can get messy. To escape pipes, for example, you could use

echo "${exclude[*]//|/\\|}"

in the command substitution, resulting in

sub_dir2|sub dir3|has\|pipe

where the directory has|pipe, which has a | in its name, has its pipe properly escaped.
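
If escaping regex metacharacters becomes unwieldy, one alternative, sketched here and not part of the answer above, is to build the find arguments in a Bash array, with one `! -iname NAME` group per line of exclude.txt. Because each name is its own array element, spaces survive without any quoting inside the file. The `args` variable and the temporary directory are assumptions for illustration. Note that `-iname` treats `*`, `?` and `[` as glob characters, so names containing those would still need escaping.

```shell
#!/usr/bin/env bash
# Recreate the example layout in a temporary directory (illustration only).
tmp=$(mktemp -d)
mkdir -p "$tmp/base_dir/sub_dir1" "$tmp/base_dir/sub_dir2" \
         "$tmp/base_dir/sub dir3" "$tmp/base_dir/sub_dir4"
printf '%s\n' 'sub_dir2' 'sub dir3' > "$tmp/exclude.txt"

# One array element per line, so "sub dir3" stays a single element.
mapfile -t exclude < "$tmp/exclude.txt"

# Append "! -iname NAME" for each entry; quoting "$name" keeps it one word.
args=()
for name in "${exclude[@]}"; do
    args+=( ! -iname "$name" )
done

# "${args[@]}" expands to exactly the words stored above, with no re-splitting.
result=$(find "$tmp/base_dir" -mindepth 1 -maxdepth 1 -type d \
             "${args[@]}" -printf '%f\n' | sort)
echo "$result"

rm -rf "$tmp"
```

With the example layout this should list only sub_dir1 and sub_dir4, matching the desired output in the question.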
