如何制作便携式正则表达式? [英] how to make portable regex?

查看:94
本文介绍了如何制作便携式正则表达式?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

正则表达式的哪些功能是标准功能,哪些是特质的?
如果我想在不同的上下文,语言和平台中使用相同的正则表达式,该怎么办?

Which features of regular expressions are standard, and which are idiosyncratic ?
What should I do, and not do, if I want to use the same regex in different context, languages, platforms ?

推荐答案

没有标准,但是如果您希望最大的可移植性,则应坚持使用JavaScript正则表达式支持的功能.所有其他主要样式都支持JS所做的所有工作,并且在各处仅有很小的变化.例如,某些仅支持POSIX字符类符号([:alpha:]),而其他一些则使用Unicode语法(\p{Alpha}).

There is no standard, but if maximum portability is your goal you should stick to the features supported by JavaScript regexes. All of the other major flavors support everything JS does, with only minor variations here and there. For example, some only support the POSIX character-class notation ([:alpha:]), while others use the Unicode syntax (\p{Alpha}).

最麻烦的变化可能是那些影响点(.)和锚点(^$)的变化.例如,JavaScript没有DOTALL(或单行")模式,因此要匹配任何包括换行符,您必须使用[\s\S]之类的技巧.同时,Ruby具有DOTALL模式,但将其称为 multiline 模式-每个人 else 都称其为"multiline"(^$作为行锚)一直有效.

Probably the most troublesome variations are those that affect the dot (.) and the anchors (^ and $). For example, JavaScript has no DOTALL (or "single-line") mode, so to match anything including a newline you have to use a hack like [\s\S]. Meanwhile, Ruby has a DOTALL mode but calls it multiline mode--what everyone else calls "multiline" (^ and $ as line anchors) is how Ruby always works.

也要注意点的完全不匹配(在默认模式下).传统上,这只是换行(\n),但是越来越多的风味正在采用(或至少近似于)

Be aware, too, of exactly what the dot doesn't match (in the default mode). Traditionally that was just the linefeed (\n), but more and more flavors are adopting (or at least approximating) the Unicode guidelines concerning line separators. For example, in Java the dot doesn't match any of [\r\n\u0085\u2028\u2029], while ^ and $ treat \r\n as a single separator and won't match between the two characters.

请注意,我只是在谈论 Perl衍生的风味,例如Python,Ruby,PHP,JavaScript等.包含grep等基于GNU或POSIX的风味没有任何意义. ,awk和MySQL;它们往往功能较少,但这不是您要选择它们的原因.

Note that I'm only talking about Perl-derived flavors, like Python, Ruby, PHP, JavaScript, etc.. It wouldn't make sense to inlcude GNU or POSIX based flavors like grep, awk, and MySQL; they tend to have fewer features, but that's not what you would choose them for anyway.

我也不包括XML Schema风格;它比JavaScript有更多的限制,但是它是一个专门的应用程序.例如,它不支持锚点(^$\A\Z等),因为匹配项始终固定在两端.

I'm also not including the XML Schema flavor; it's much more limited than JavaScript, but it's a specialized application. For example, it doesn't support the anchors (^, $, \A, \Z, etc.) because matches are always anchored at both ends.

这篇关于如何制作便携式正则表达式?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆