bash中的=〜运算符支持哪种正则表达式语法? [英] What regex syntax is supported by the =~ operator in bash?
问题描述
我发现我可以在我的4.2.10(1)BASH中使用 =~
运算符代替 expr
命令. (在命令内)它比expr
快得多,并且在循环较大的循环内部,这一事实可能很重要.
I discovered that I can use =~
operator instead of expr
command in my 4.2.10(1) BASH. It is much faster (within a command) than expr
and this fact could be important inside in a loop with large repetition.
我能够使用大多数正则表达式的元字符,但不是全部.
I was able to use most of the meta characters of regular expression but not all.
例如,我可以检查一个字符串是否完全匹配3个重复(一个小写字母,一个数字,一个点):
For example I can check a string matches exactly 3 repetitions of (one small letter, one digit, one dot):
[[ "b3.f5.h3." =~ ^([a-z][0-9]\.){3}$ ]] && echo OK
OK
然后我可以选择匹配的子字符串:
and I can select matched substrings:
[[ "whatis12345thetwo765nmbers" =~ ^[a-z]+([0-9]+)[a-z]+([0-9]+) ]] && \
echo "The two number fields are: ${BASH_REMATCH[1]} ${BASH_REMATCH[2]}"
The two number fields are: 12345 765
But I would like to use more meta characters, such as the ones listed on this TLDP page.
我特别想匹配单词边界: \b
, \B
, \<
,
I would especially like to match word boundaries: \b
, \B
, \<
, \>
.
我试图在高级Bash脚本指南(第18和37章)中找到答案,但是没有成功.
I tried to find an answer in the Advanced Bash-Scripting Guide (in Chapters 18 and 37) but was unsuccessful.
在哪里可以找到 =~
运算符的详细说明?
Where can I find a detailed description of =~
operator?
目前,我只对BASH感兴趣,而对gawk,sed,perl或其他工具不感兴趣.
At the moment I am interested only in BASH and not in gawk, sed, perl or other tools.
推荐答案
=~
支持POSIX ERE,除了本地C库添加的扩展名外,没有其他扩展名(从字面上讲,它调用标准C库的regex调用).因此,关于保证可支持的功能的规范文档(与本地C库可能另外添加的可选功能相对)是ERE上的规范,
=~
supports POSIX ERE with no extensions additional to those added by the local C library (literally, it calls the standard C library's regex calls). Thus, the canonical documentation on features it's guaranteed to support (as opposed to optional features your local C library may add in addition) is the specification on ERE, IEEE 1003.1, section 9.4.
对此进行放大:由一个特定的libc(例如glibc)添加但POSIX规范中不存在的任何内容(例如\<
)不能期望bash支持的所有平台都可以移植
To amplify this: Anything, such as \<
, added by one particular libc (ie. glibc) but not present in the POSIX specification cannot be expected to work portably across all platforms bash supports.
POSIX指定的特殊字符(如9.4.3节中所述,如 )不包括 <
,>
,b
或B
;这些都是GNU扩展,不可移植.
The POSIX-specified special characters (as given in section 9.4.3 of the standard) do not include <
, >
, b
or B
; these are all GNU extensions and nonportable.
这篇关于bash中的=〜运算符支持哪种正则表达式语法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!