Locale-specific behavior in the regex library?


Question

When I imbue a regex object with a particular locale, how does it affect the matching behavior? Does it affect collation, or anything else? I can't seem to find an explanation anywhere.

Recommended answer

It affects at least the following:


  • Collation: the regex [a-f] imbued with a French locale should match the character é.
  • Character classification: \w in a Finnish locale should match the character ä. The range [a-z] should not, however, because å, ä and ö collate after z in Finnish. In German, by contrast, [a-z] should match ä.
  • Unicode equivalence: in a Unicode-compatible locale, the Unicode equivalence algorithm should be used, so that composed forms of a character match the decomposed form and vice versa.
  • POSIX classes: with a POSIX-compatible regex flavor (basic, extended, awk, grep, and egrep), the POSIX character classes should be locale-aware: [=e=] should match é in a French locale but not in an English locale.
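The points above can be tried out with a small sketch. It assumes the question is about C++ `std::regex` (the word "imbue" suggests so, but the original post does not say). The helper names `locale_or_classic`, `make_regex` and `full_match` are hypothetical, and locale names such as "fr_FR.UTF-8" are platform-specific. Two details of `std::regex` matter here: `imbue()` has to be called before the pattern is assigned, and ranges like [a-f] are only locale-sensitive when the `collate` syntax flag is set.

```cpp
#include <locale>
#include <regex>
#include <stdexcept>
#include <string>

// Return the named locale if the system provides it, otherwise the classic
// "C" locale. Which names exist (e.g. "fr_FR.UTF-8") depends on the platform.
std::locale locale_or_classic(const char* name) {
    try {
        return std::locale(name);
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}

// Build a wide-character regex that uses `loc` for character classification
// and, with the collate flag, for range comparison.
// imbue() comes first: imbuing discards any pattern the object already holds.
std::wregex make_regex(
    const std::wstring& pattern,
    const std::locale& loc,
    std::regex_constants::syntax_option_type flags =
        std::regex_constants::ECMAScript) {
    std::wregex re;
    re.imbue(loc);
    re.assign(pattern, flags);
    return re;
}

// True if the whole of `text` matches `pattern` under locale `loc`.
bool full_match(
    const std::wstring& text,
    const std::wstring& pattern,
    const std::locale& loc,
    std::regex_constants::syntax_option_type flags =
        std::regex_constants::ECMAScript) {
    return std::regex_match(text, make_regex(pattern, loc, flags));
}
```

To probe the first bullet, one could call `full_match(L"é", L"[a-f]", locale_or_classic("fr_FR.UTF-8"), std::regex_constants::ECMAScript | std::regex_constants::collate)`. Whether it returns true depends on the standard library implementation and on the locales installed, so treat the result as an experiment rather than a guarantee.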
