Matching text between delimiters: greedy or lazy regular expression?

Views: 104 Published: 2020/4/27 3:49:14 Tags: regex language-agnostic greedy regex-greedy

For the common problem of matching text between delimiters (e.g. < and >), there are two common patterns:

using the greedy * or + quantifier in the form START [^END]* END, e.g. <[^>]*>, or
using the lazy *? or +? quantifier in the form START .*? END, e.g. <.*?>.
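
To make the two forms concrete, here is a minimal Python sketch that runs both patterns over the same input. The sample string is an assumption chosen for illustration, and the plain greedy `<.*>` is included only as a contrast to show why neither form uses it.

```python
import re

# Hypothetical sample input with three delimited spans.
text = "a <b> c <i>d</i>"

# Greedy quantifier over a negated character class: START [^END]* END
greedy_negated = re.findall(r"<[^>]*>", text)

# Lazy quantifier over the dot: START .*? END
lazy_dot = re.findall(r"<.*?>", text)

# Contrast: a plain greedy dot runs on to the LAST '>' in the string.
plain_greedy = re.findall(r"<.*>", text)

print(greedy_negated)
print(lazy_dot)
print(plain_greedy)
```

On simple single-line input like this, the two patterns from the list find the same spans, so the choice between them comes down to the trade-offs discussed in the rest of the question.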

Is there a particular reason to favour one over the other?

Some merits:

[^>]*:

More expressive.
Matches newlines regardless of the /s (dot-matches-newline) flag.
Generally considered quicker, because the engine doesn't have to backtrack to find a successful match (with [^>] the engine makes no choices: there is only one way to match the pattern against the string).
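
The newline point can be checked with a short Python sketch. The input string is an assumption for illustration; Python's `re.DOTALL` plays the role of the /s flag mentioned above.

```python
import re

# Hypothetical tag that spans two lines.
text = "<a\nhref='x'>"

# The negated class matches any character except '>', newlines included.
negated = re.search(r"<[^>]*>", text)

# Without DOTALL, '.' does not match '\n', so the lazy form finds nothing.
lazy_default = re.search(r"<.*?>", text)

# With DOTALL (Python's equivalent of /s), the lazy form matches again.
lazy_dotall = re.search(r"<.*?>", text, re.DOTALL)

print(negated.group(0) if negated else None)
print(lazy_default)
print(lazy_dotall.group(0) if lazy_dotall else None)
```

Whether crossing newlines is a merit depends on the input: it is what you want for multi-line tags, but it also means `[^>]*` can never be restricted to a single line by toggling a flag.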

.*?:

No "code duplication" - the end character only appears once.
Simpler in cases where the end delimiter is more than one character long (a character class would not work in this case). The common greedy alternative is (?:(?!END).)*, and it gets even worse if the END delimiter is itself a pattern.
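
The multi-character case can be sketched in Python as follows, assuming `<!--` and `-->` as the START and END delimiters (these delimiters and the sample text are illustrative choices, not taken from the question).

```python
import re

# Hypothetical input: the first comment contains stray '-' and '->' sequences.
text = "x <!-- a - b -> c --> y <!-- d -->"

# Lazy form: the end delimiter appears once, whatever its length.
lazy = re.findall(r"<!--(.*?)-->", text)

# Greedy alternative: consume one character at a time, but only while
# the end delimiter does not start at the current position.
tempered = re.findall(r"<!--((?:(?!-->).)*)-->", text)

# Naive character class: [^-] rejects ANY '-', not just the '-->' sequence,
# so the first comment is never matched.
char_class = re.findall(r"<!--([^-]*)-->", text)

print(lazy)
print(tempered)
print(char_class)
```

The lazy and lookahead-based patterns capture the same text here, but the lookahead version repeats the end delimiter and is noticeably harder to read, which is the readability cost the list above refers to.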
