Locale specific behavior in the regex library?
Question
When I imbue a regex object with a particular locale, how does it affect the matching behavior? Does it affect collation, or anything else? I can't seem to find an explanation anywhere.
Recommended answer
It affects at least the following:
- Collation: the regex [a-f] imbued with a French locale should match the character é.
- Similarly, \w in a Finnish locale should match the character ä, but [a-z] should not, because å, ä and ö collate after z in Finnish. In German, however, [a-z] should match ä.
- In a Unicode-compatible locale, the Unicode equivalence algorithm should be used, so that composed forms of a character match a decomposed form and vice versa.
- With a POSIX-compatible regex flavor (basic, extended, awk, grep, and egrep), the POSIX bracket-expression classes should be locale-aware: the equivalence class [=e=] (written [[=e=]] as a complete bracket expression) should match é in a French locale but not in an English locale.
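
To try the points above on a given system, a minimal sketch follows. It assumes the library in question is C++11 std::regex (the question does not name one; Boost.Regex has a similar imbue interface). The helper names locale_or_classic and matches_in_locale are hypothetical and only for illustration. In std::regex, locale-sensitive ranges such as [a-f] are only requested when the std::regex_constants::collate flag is set, and how faithfully they are honored varies between standard library implementations, so no particular result is promised for non-ASCII input.

```cpp
#include <locale>
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical helper: construct a named locale, falling back to the
// classic "C" locale when that name is not installed on this system.
std::locale locale_or_classic(const char* name) {
    try {
        return std::locale(name);
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}

// Hypothetical helper: match `text` against `pattern` using a regex that
// has been imbued with `loc`. imbue() is called before assign() because
// imbuing a regex discards any pattern it already holds.
bool matches_in_locale(const std::wstring& pattern,
                       const std::wstring& text,
                       const std::locale& loc,
                       std::regex_constants::syntax_option_type flags =
                           std::regex_constants::ECMAScript) {
    std::wregex re;
    re.imbue(loc);
    re.assign(pattern, flags);
    return std::regex_match(text, re);
}
```

For example, matches_in_locale(L"[a-f]", L"\u00e9", locale_or_classic("fr_FR.UTF-8"), std::regex_constants::ECMAScript | std::regex_constants::collate) asks whether é falls inside [a-f] under French collation, and passing std::regex_constants::extended | std::regex_constants::collate with the pattern L"[[=e=]]" probes the equivalence class. Locale names such as "fr_FR.UTF-8" are platform-dependent; comparing the result against the same call with std::locale::classic() shows what the locale actually changed.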