Regex collating symbols
Question
I tried to understand how matching with "collating symbols" works, but I could not figure it out. I understood that it means matching an exact sequence instead of just the individual character(s), that is:
echo "ciiiao" | grep '[oa]' --> output 'ciiiao'
echo "ciiiao" | grep '[[.oa.]]' --> no output
echo "ciiiao" | grep '[[.ia.]]' --> output 'ciiiao'
However, the third command does not work. Am I wrong, or am I misinterpreting something?
I have read this regexp tutorial.
Recommended answer
Collating symbols are typically used when a digraph is treated like a single character in a language. They are an element of the POSIX regular expression specification and are not widely supported.
For example, in some languages "sh" is treated as a single character. Assuming the locale file defines it (a collating symbol only works if it is defined in the current locale), the collating symbol [[.sh.]] is treated like a single character. Likewise, searching for . or [^a] could also match "sh". If the locale defined "sh" as following "h" in the alphabet, then [g-i] would also match "sh".
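To see how this plays out on a particular system, a small probe along the following lines may help. It is only a sketch: try_pattern is a hypothetical helper name, and it forces the C locale (an assumption made for reproducibility), where no multi-character collating elements such as "ia" or "sh" are defined. It relies on grep's documented exit status: 0 for a match, 1 for no match, and a higher value for an error. Depending on the regex implementation, an undefined or unsupported collating symbol may be reported as an error or simply fail to match.

```shell
#!/bin/sh
# Hypothetical helper: report how grep treats a pattern on a given input.
# grep exits 0 on a match, 1 on no match, and 2 or more on an error
# (for example, a collating symbol the implementation or locale rejects).
try_pattern() {
    input=$1
    pattern=$2
    # LC_ALL=C is an assumption for reproducibility; drop it to test your own locale.
    printf '%s\n' "$input" | LC_ALL=C grep -q -- "$pattern" 2>/dev/null
    case $? in
        0) echo "match" ;;
        1) echo "no match" ;;
        *) echo "error" ;;
    esac
}

# Ordinary bracket expression: matches a single 'o' or 'a'.
try_pattern 'ciiiao' '[oa]'

# Single-character collating symbol for '-'; some implementations reject
# collating symbols entirely and report an error instead.
try_pattern 'a-b' '[[.-.]]'

# Multi-character collating symbol: only meaningful if the current locale
# defines "ia" as one collating element, which the C locale does not.
try_pattern 'ciiiao' '[[.ia.]]'
```

If the last call reports an error or no match, that is consistent with the explanation above: [[.ia.]] is not shorthand for the literal sequence "ia", it names a collating element that the locale has to define.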