Shell: find files in a list

Views: 87 Published: 2016/8/2 13:49:06 Tags: linux bash shell


Question

I have a list of about 1000 file names, and I want to find the paths of these files under a base directory. The directory has many subdirectories, which together hold more than 1,000,000 files. The following command runs find 1000 times:

cat filelist.txt | while read -r f; do find /dir -name "$f"; done

Is there a faster or better way to do it?

Recommended answer

If filelist.txt has a single filename per line:

find /dir | grep -f <(sed 's@^@/@; s/$/$/; s/\([\.[\*]\|\]\)/\\\1/g' filelist.txt)

(The -f option tells grep to search for all the patterns listed in the given file.)
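To make the role of -f concrete, here is a minimal sketch with made-up data: a hypothetical two-line pattern file and four input lines. grep prints every input line that matches at least one of the patterns.

```shell
# Hypothetical pattern file: one regex per line.
patterns=$(mktemp)
printf '%s\n' 'apple$' '^kiwi' > "$patterns"

# grep -f reads the patterns from the file and keeps any line
# that matches at least one of them.
matched=$(printf '%s\n' 'green apple' 'apple pie' 'kiwi jam' 'banana' \
  | grep -f "$patterns")
printf '%s\n' "$matched"

rm -f "$patterns"
```

Only the lines ending in "apple" or starting with "kiwi" survive, which is the behaviour the answer relies on with its 1000 filename patterns.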

Explanation of <(sed 's@^@/@; s/$/$/; s/\([\.[\*]\|\]\)/\\\1/g' filelist.txt):

The <( ... ) is called a process substitution, and is a little similar to $( ... ). The command above is equivalent to the following (although process substitution is neater and possibly a little faster):

sed 's@^@/@; s/$/$/; s/\([\.[\*]\|\]\)/\\\1/g' filelist.txt > processed_filelist.txt
find /dir | grep -f processed_filelist.txt

The call to sed runs the commands s@^@/@, s/$/$/ and s/\([\.[\*]\|\]\)/\\\1/g on each line of filelist.txt and prints the results. These commands convert the filenames into patterns that work better with grep.

s@^@/@ puts a / before each filename. (The ^ means "start of line" in a regex.)
s/$/$/ puts a $ at the end of each filename. (The first $ means "end of line"; the second is just a literal $, which grep then interprets as "end of line".)
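As a small illustration, the two commands can be applied one at a time to a single name. The variable names here are only for the demo and are not part of the answer.

```shell
name='a.txt'

# Step 1: prepend a slash so the pattern starts at a path separator.
prefixed=$(printf '%s\n' "$name" | sed 's@^@/@')

# Step 2: append a literal dollar sign, which grep later reads as end-of-line.
anchored=$(printf '%s\n' "$prefixed" | sed 's/$/$/')

printf '%s\n%s\n' "$prefixed" "$anchored"
```

The dot is still unescaped at this point; that is what the third sed command takes care of.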

The combination of these two rules means that grep only looks for matches like .../<filename>, so that a.txt doesn't match ./a.txt.backup or ./abba.txt.
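The effect of the two anchors can be checked on a handful of paths. This is a rough sketch: the paths are hypothetical stand-ins for what find might print, and the patterns are written by hand rather than generated by sed.

```shell
# Candidate paths, as find might print them (hypothetical names).
paths=$(printf '%s\n' './a.txt' './a.txt.backup' './abba.txt' './sub/a.txt')

# Unanchored pattern: also hits a.txt.backup and abba.txt.
loose=$(printf '%s\n' "$paths" | grep 'a\.txt')

# Leading / and trailing $: only files named exactly a.txt match.
strict=$(printf '%s\n' "$paths" | grep '/a\.txt$')

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```

The loose pattern keeps all four paths, while the anchored one keeps only ./a.txt and ./sub/a.txt.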

s/\([\.[\*]\|\]\)/\\\1/g puts a \ before each occurrence of . [ ] or * (and of a literal \ itself). grep uses regexes, in which these characters are special, but we want them treated as plain characters, so we need to escape them (if we didn't, a filename like a.txt would also match files like abtxt).

As an example:
$ cat filelist.txt
file1.txt
file2.txt
blah[2012].txt
blah[2011].txt
lastfile

$ sed 's@^@/@; s/$/$/; s/\([\.[\*]\|\]\)/\\\1/g' filelist.txt
/file1\.txt$
/file2\.txt$
/blah\[2012\]\.txt$
/blah\[2011\]\.txt$
/lastfile$

grep then uses each line of that output as a pattern when it searches the output of find.
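The whole approach can be tried on a throwaway directory tree. The sketch below assumes GNU sed and grep, invents all file and directory names, and uses the temporary-file form of the command so that it does not depend on process substitution.

```shell
# Scratch area; the tree lives in a subdirectory so the list files
# are not part of the search.
work=$(mktemp -d)
base="$work/tree"
mkdir -p "$base/x/y" "$base/z"

# Three wanted files plus three near-misses that must not match.
touch "$base/x/file1.txt" "$base/x/y/blah[2012].txt" "$base/z/lastfile" \
      "$base/z/file1.txt.backup" "$base/z/file1-txt" "$base/x/notlastfile"

printf '%s\n' 'file1.txt' 'blah[2012].txt' 'lastfile' > "$work/filelist.txt"

# Turn each name into an anchored, escaped pattern (same sed as above).
sed 's@^@/@; s/$/$/; s/\([\.[\*]\|\]\)/\\\1/g' "$work/filelist.txt" \
  > "$work/patterns.txt"

# A single pass over the tree; sort only makes the output order predictable.
found=$(find "$base" -type f | grep -f "$work/patterns.txt" | sort)
printf '%s\n' "$found"

rm -rf "$work"
```

find runs once instead of once per name, and the near-misses (file1.txt.backup, file1-txt, notlastfile) are filtered out by the anchoring and escaping.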
