What expands to all files in current directory recursively?
Problem description
I know **/*.ext expands to all files in all subdirectories matching *.ext, but what is a similar expansion that includes all such files in the current directory as well?
This will work in Bash 4:
ls -l {,**/}*.ext
In order for the double-asterisk glob to work, the globstar option needs to be set (default: off):
shopt -s globstar
From man bash:
globstar
    If set, the pattern ** used in a filename expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a /, only directories and subdirectories match.
Now I'm wondering if there might have once been a bug in globstar processing, because now, using simply ls **/*.ext, I'm getting correct results.
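To make the difference between the two patterns concrete, here is a minimal sketch, assuming Bash 4 or later. The scratch directory and the file names (a.ext, sub/b.ext, sub/deep/c.ext) are invented for illustration and are not from the original question. It collects each expansion into an array and prints the counts.

```shell
#!/usr/bin/env bash
# Hypothetical scratch tree: one .ext file at each of three depths.
shopt -s globstar            # globstar requires Bash 4 or later

workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p sub/deep
touch a.ext sub/b.ext sub/deep/c.ext

# With globstar set, **/ may match zero directories, so a.ext is included.
plain=( **/*.ext )

# Brace expansion runs first and yields two patterns: *.ext and **/*.ext.
# Both match a.ext, so the top-level file appears twice.
braced=( {,**/}*.ext )

printf 'plain:  %s\n' "${plain[*]}"
printf 'braced: %s\n' "${braced[*]}"
echo "plain count: ${#plain[@]}, braced count: ${#braced[@]}"

cd / && rm -rf "$workdir"
```

With globstar enabled, the plain **/*.ext pattern already covers the current directory, which is consistent with the observation above that ls **/*.ext gives correct results.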
Regardless, I looked at the analysis kenorb did using the VLC repository and found some problems with that analysis and in my answer immediately above:
The comparisons to the output of the find command are invalid, since specifying -type f excludes other file types (directories in particular), whereas the ls commands listed likely include them. Also, one of the commands listed, ls -1 {,**/}*.* (which would seem to be based on mine above), only outputs names that include a dot for those files that are in subdirectories. The OP's question and my answer include a dot since what is being sought is files with a specific extension.
Most importantly, however, there is a special issue with using the ls command with the globstar pattern **. Many duplicates arise, since the pattern is expanded by Bash to all file names (and directory names) in the tree being examined. After the expansion, the ls command lists each of them, plus their contents if they are directories.
Example:
In our current directory is the subdirectory A and its contents:
A
└── AB
    └── ABC
        ├── ABC1
        ├── ABC2
        └── ABCD
            └── ABCD1
In that tree, ** expands to "A A/AB A/AB/ABC A/AB/ABC/ABC1 A/AB/ABC/ABC2 A/AB/ABC/ABCD A/AB/ABC/ABCD/ABCD1" (7 entries). If you do echo **, that's the exact output you'd get, and each entry is represented once. However, if you do ls **, it's going to output a listing of each of those entries. So essentially it does ls A followed by ls A/AB, etc., so A/AB gets shown twice. Also, ls is going to set each subdirectory's output apart:
...
<blank line>
directory name:
content-item
content-item
So using wc -l counts all those blank lines and directory name section headings, which throws off the count even further.
This is yet another reason why you should not parse ls.
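The mismatch can be reproduced with a short sketch that rebuilds the example tree in a scratch directory. It assumes Bash 4 or later and that ABC1, ABC2 and ABCD1 are regular files (the tree above does not say). The variable names are illustrative only.

```shell
#!/usr/bin/env bash
shopt -s globstar

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Rebuild the example tree; the leaf entries are assumed to be regular files.
mkdir -p A/AB/ABC/ABCD
touch A/AB/ABC/ABC1 A/AB/ABC/ABC2 A/AB/ABC/ABCD/ABCD1

# Each path in the tree appears exactly once in the glob expansion.
entries=( ** )
glob_count=${#entries[@]}

# ls lists every argument and also the contents of each directory argument,
# with blank separator lines and "directory:" headings in between.
# The arithmetic expansion strips any padding that wc may add.
ls_lines=$(( $(ls ** | wc -l) ))

echo "entries from **:  $glob_count"
echo "lines from ls **: $ls_lines"

cd / && rm -rf "$workdir"
```

The first number is the real entry count. The second is inflated by the repeated names, blank lines and headings described above.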
As a result of this further analysis, I recommend not using the globstar pattern in any circumstance other than iterating over a tree of files in this manner:
for entry in **
do
    something "$entry"
done
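As one possible use of that loop, the sketch below counts only regular files, roughly what find -type f would report. It assumes Bash 4 or later. The directory and file names are made up, and some contain spaces on purpose to show that quoting "$entry" keeps each name intact.

```shell
#!/usr/bin/env bash
shopt -s globstar

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical tree with spaces in some names.
mkdir -p "dir one/nested"
touch top.txt "dir one/has space.txt" "dir one/nested/deep.txt"

file_count=0
for entry in **
do
    # ** yields directories as well as files; keep regular files only.
    if [[ -f $entry ]]; then
        printf 'file: %s\n' "$entry"
        file_count=$((file_count + 1))
    fi
done
echo "regular files: $file_count"

cd / && rm -rf "$workdir"
```

Because the shell hands each name to the loop as a separate word, nothing has to be parsed back out of text output, so spaces in names cause no trouble.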
As a final comparison, I used a Bash source repository I had handy and did this:
shopt -s globstar dotglob
diff <(echo ** | tr ' ' '\n') <(find . | sed 's|./||' | sort)
0a1
> .
I used tr to change spaces to newlines, which is only valid here since no names include spaces. I used sed to remove the leading ./ from each line of output from find. I sorted the output of find since it is normally unsorted and Bash's expansion of globs is already sorted. As you can see, the only output from diff was the current directory . output by find. When I did ls ** | wc -l the output had almost twice as many lines.
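A variant of the same comparison that tolerates spaces in names might look like the sketch below. It is not the command I ran. It swaps echo and tr for printf '%s\n', uses find's -mindepth 1 (available in GNU and BSD find, not in POSIX) to leave out the . entry, and sorts both sides in the C locale so ordering cannot differ. The scratch tree is invented, and names containing newlines would still break it.

```shell
#!/usr/bin/env bash
shopt -s globstar dotglob

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical tree with hidden files and names containing spaces.
mkdir -p "sub dir/inner"
touch .hidden plain.txt "sub dir/a file.txt" "sub dir/inner/.h"

# printf emits one name per line, so spaces inside names survive.
# -mindepth 1 omits the starting directory "." from find's output.
difference=$(diff \
    <(printf '%s\n' ** | LC_ALL=C sort) \
    <(find . -mindepth 1 | sed 's|^\./||' | LC_ALL=C sort))

if [[ -z $difference ]]; then
    echo "glob and find agree"
else
    printf '%s\n' "$difference"
fi

cd / && rm -rf "$workdir"
```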