Using a glob expression passed as a bash script argument
Question
Why isn't invoking ./myscript foo* when myscript has var=$1 the same as invoking ./myscript with var=foo* hardcoded?
I've come across a weird issue in a bash script I'm writing. I am sure there is a simple explanation, but I can't figure it out.
I am trying to pass a command line argument to be assigned as a variable in the script.
I want the script to allow 2 command line arguments as follows:
$ bash my_bash_script.bash args1 args2
In my script, I assigned variables like this:
ARGS1=$1
ARGS2=$2
Args 1 is a string descriptor to add to the output file.
Args 2 is a group of directories: "dir1, dir2, dir3", which I am passing as dir*
When I assign dir* to ARGS2 in the script it works fine, but when I pass dir* as the second command line argument, it only includes dir1 in the wildcard expansion of dir*.
I assume this has something to do with how the shell handles wildcards (even when passed as args), but I don't really understand it.
Any help would be greatly appreciated.
I have a set of directories:
dir_1_y_map, dir_1_x_map, dir_2_y_map, dir_2_x_map,
... dir_10_y_map, dir_10_x_map...
Inside these directories I am trying to access a file with extension ".status" via *.status, and a file with extension ".report.txt" via *report.txt.
I want to pass dir_*_map as the second argument to the script and store it in the variable ARGS2, then use it to search within each of the directories for the ".status" and ".report" files.
The issue is that passing dir_*_map from the command line doesn't give the list of directories, but rather just the first item in the list. If I assign the variable ARGS2=dir_*_map within the script, it works as I intend.
It turns out that passing the second argument in quotes allowed the wildcard expansion to work appropriately for "dir_*_map":
#!/usr/bin/env bash
ARGS1=$1
ARGS2=$2
touch $ARGS1".extension"
for i in /$ARGS2/*.status
do
grep -e "string" $i >> $ARGS1".extension"
done
Here is an example invocation of the script:
sh ~/path/to/script descriptor "dir_*_map"
I don't fully understand when/why some arguments must be passed in quotes, but I assume it has to do with the wildcard expansion in the for loop.
Recommended answer
Addressing the "why"
Assignments, as in var=foo*, don't expand globs. That is, when you run var=foo*, the literal string foo* is put into the variable var, not the list of files matching foo*.
By contrast, unquoted use of foo* on a command line expands the glob, replacing it with a list of individual names, each of which is passed as a separate argument.
Thus, running ./yourscript foo* doesn't pass foo* as $1 unless no files matching that glob expression exist; instead, it becomes something like ./yourscript foo01 foo02 foo03, with each argument in a different position on the command line.
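The difference between assignment and command-line use can be checked with a small sketch. It assumes a scratch directory created with mktemp, three sample files, and a hypothetical helper named count_args; none of these come from the original question.

```shell
#!/usr/bin/env bash
# Sketch only: count_args and the foo0N file names are illustrative assumptions.
count_args() { echo "$#"; }

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1
touch foo01 foo02 foo03

var=foo*                            # assignment: no glob expansion happens here
echo "stored value: $var"           # quoted expansion prints the literal pattern

unquoted_count=$(count_args foo*)   # the calling shell expands the glob before the call
quoted_count=$(count_args "foo*")   # quotes keep the pattern as one literal argument
echo "unquoted: $unquoted_count, quoted: $quoted_count"
```

With three matching files, the unquoted call receives three arguments while the quoted call receives exactly one.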
The reason running ./yourscript "foo*" functions as a workaround is that the unquoted expansion inside the script allows the glob to be expanded at that later time. However, this is bad practice: an unquoted expansion performs string-splitting as well as glob expansion (meaning that relying on this behavior removes your ability to pass filenames containing characters found in IFS, typically whitespace), and it also means that you can't pass literal filenames when they could also be interpreted as globs (if you have a file named [1] and a file named 1, passing [1] would always be replaced with 1).
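Both pitfalls can be reproduced in a short sketch. The expand_later helper below is hypothetical and only mimics a script that receives a quoted pattern and expands it unquoted; the file names are likewise made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: the file names and the expand_later helper are illustrative assumptions.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1
touch 'my file' '[1]' 1

# Mimics a script that is handed a quoted pattern and expands it unquoted later.
expand_later() {
  local pattern=$1
  printf '%s\n' $pattern    # deliberately unquoted: word splitting, then globbing
}

split_result=$(expand_later 'my file')   # the single file name is split into two words
glob_result=$(expand_later '[1]')        # read as a character class, so it matches the file named 1
printf 'split: %s\n' "$split_result"
printf 'glob: %s\n' "$glob_result"
```

There is no way for the caller to refer to the file literally named [1] through this interface, which is the limitation described above.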
The idiomatic way to build this would be to shift away the first argument, and then iterate over subsequent ones, like so:
#!/bin/bash
out_base=$1; shift
shopt -s nullglob                   # avoid generating an error if a directory has no .status
for dir; do                         # iterate over directories originally passed as $2, $3, etc
  for file in "$dir"/*.status; do   # iterate over files ending in .status within those
    grep -e "string" "$file"        # match a single file
  done
done >"${out_base}.extension"
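What nullglob changes in the loop above can be seen in isolation. This is a minimal sketch assuming an empty scratch directory and bash's default settings (in particular, failglob unset); the counter names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: the empty scratch directory and counter names are illustrative assumptions.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

shopt -u nullglob
without=0
# No match: the pattern is left as a literal word, so the body runs once.
for file in "$workdir"/*.status; do without=$((without + 1)); done

shopt -s nullglob
with=0
# No match: the pattern expands to nothing, so the body never runs.
for file in "$workdir"/*.status; do with=$((with + 1)); done

echo "iterations without nullglob: $without, with nullglob: $with"
```

Without nullglob, grep would be handed the literal path ending in *.status and complain that no such file exists.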
If you have many .status files in a single directory, all this can be made more efficient by using find to invoke grep with as many arguments as possible, rather than calling grep individually on a per-file basis:
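#!/bin/bash
out_base=$1; shift
# -e "string" supplies the pattern; without it grep would treat /dev/null as the pattern
find "$@" -maxdepth 1 -type f -name '*.status' \
  -exec grep -h -e "string" -- /dev/null '{}' + \
  >"${out_base}.extension"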
Both scripts above expect the globs passed not to be quoted on the invoking shell. Thus, usage is of the form:
# being unquoted, this expands the glob into a series of separate arguments
your_script descriptor dir_*_map
This is considerably better practice than passing globs to your script (which then is required to expand them to retrieve the actual files to use); it works correctly with filenames containing whitespace (which the other practice doesn't), and with files whose names are themselves glob expressions.
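As a rough end-to-end check of this calling convention, the loop can be wrapped in a function and run against directories whose names contain whitespace. The collect_status name, the directory names, and the sample file contents are assumptions made for this sketch.

```shell
#!/usr/bin/env bash
# Sketch only: collect_status, the directory names, and the sample data are illustrative.
collect_status() {
  local out_base=$1; shift
  local dir file
  for dir; do
    for file in "$dir"/*.status; do
      grep -e "string" "$file"
    done
  done >"${out_base}.extension"
}

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1
shopt -s nullglob
mkdir 'dir_1_x_map' 'dir 2 x map'
echo 'string one' >'dir_1_x_map/a.status'
echo 'string two' >'dir 2 x map/b.status'

collect_status descriptor dir*      # unquoted: the calling shell expands the glob
line_count=$(grep -c 'string' descriptor.extension)
echo "matching lines collected: $line_count"
```

Because the calling shell passes each directory as its own argument, the name containing spaces arrives intact and both files are searched.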
Some other points:
- Always put double quotes around expansions! Failing to do so results in the additional steps of string-splitting and glob expansion (in that order) being applied. If you want globbing, as in the case of "$dir"/*.status, then end the quotes before the glob expression starts.
- for dir; do is precisely equivalent to for dir in "$@"; do, which iterates over arguments. Don't make the mistake of using for dir in $*; do or for dir in $@; do instead! These latter invocations effectively combine the elements of the list using the first character of IFS (which, by default, contains the space, the tab and the newline in that order), then split the resulting string on any IFS characters found within, then expand each component of the resulting list as a glob.
- Passing /dev/null as an argument to grep is a safety measure: it ensures that you don't have different behavior between the single-argument and multi-argument cases (as an example, grep defaults to printing filenames within output only when passed multiple arguments), and ensures that you can't have grep hang trying to read from stdin if it's passed no additional filenames at all (which find won't do here, but xargs can).
- Using lower-case names for your own variables (as opposed to system- and shell-provided variables, which have all-uppercase names) is in accordance with POSIX-specified convention; see the fourth paragraph of the POSIX specification regarding environment variables, keeping in mind that environment variables and shell variables share a namespace.
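The points about argument iteration and /dev/null can be checked with a brief sketch. The count_* helpers, sample arguments, and sample file are hypothetical, and the default IFS is assumed. Note that the find example earlier passes -h, which suppresses file names entirely; the grep calls here omit -h so the default behavior is visible.

```shell
#!/usr/bin/env bash
# Sketch only: the count_* helpers, arguments, and sample file are illustrative assumptions.
count_quoted()   { local n=0 arg; for arg in "$@"; do n=$((n + 1)); done; echo "$n"; }
count_implicit() { local n=0 arg; for arg; do n=$((n + 1)); done; echo "$n"; }
count_unquoted() { local n=0 arg; for arg in $*; do n=$((n + 1)); done; echo "$n"; }

quoted=$(count_quoted 'dir one' 'dir two')       # each argument stays intact
implicit=$(count_implicit 'dir one' 'dir two')   # same as iterating over "$@"
unquoted=$(count_unquoted 'dir one' 'dir two')   # arguments are split on whitespace
echo "quoted: $quoted, implicit: $implicit, unquoted: $unquoted"

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1
echo 'string here' >a.status

single=$(grep -e "string" a.status)             # one file operand: no file name prefix
padded=$(grep -e "string" /dev/null a.status)   # two file operands: prefix is printed
echo "$single"
echo "$padded"
```

The two quoted forms see two arguments while the unquoted form sees four, and adding /dev/null makes grep's output format the same whether it is handed one real file or many.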