Which would be better: a non-greedy regex or a negated character class?
Question
I need to match `@anything_here@` from the string `@anything_here@dhhhd@shdjhjs@`, so I used one of the following regexes:
^@.*?@
or
^@[^@]*@
Both work, but I would like to know which is the better solution: a regex with non-greedy repetition or a regex with a negated character class?
Recommended answer
Negated character classes should usually be preferred over lazy matching, if possible.
If the regex succeeds, `^@[^@]*@` can match the content between the `@`s in a single step, while `^@.*?@` has to expand once for each character between the `@`s.
When the match fails (because there is no closing `@`), most regex engines apply a little magic and internally treat `[^@]*` as the possessive `[^@]*+`, because there is a clear-cut border between `@` and non-`@`. The class therefore matches to the end of the string, the engine recognizes the missing `@`, and it fails immediately instead of backtracking. `.*?` still expands character by character as usual.
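As a minimal sketch in Python's `re` module, the two patterns from the question can be compared on a terminated and an unterminated input. The `unterminated` string is a hypothetical example, not taken from the question. The sketch only shows that both patterns return the same results; any difference in how much work the engine does, and whether a particular engine applies the possessive optimization, is not visible here.

```python
import re

# The two candidate patterns from the question.
LAZY = re.compile(r"^@.*?@")
NEGATED = re.compile(r"^@[^@]*@")

terminated = "@anything_here@dhhhd@shdjhjs@"
unterminated = "@anything_here"  # hypothetical input with no closing @

for pattern in (LAZY, NEGATED):
    hit = pattern.match(terminated)
    miss = pattern.match(unterminated)
    # Both patterns capture the first @...@ pair and both fail
    # when the closing @ is missing.
    print(pattern.pattern, hit.group(0) if hit else None, miss)
```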
When used in a larger context, `[^@]*` will also never expand past the closing `@`, whereas lazy matching very well can. For example, `^@[^@]*a[^@]*@` won't match `@bbbb@a@`, while `^@.*?a.*?@` will.
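The example with `@bbbb@a@` can be checked with a short sketch, here using Python's `re` module (the choice of engine is an assumption; the answer itself is engine-agnostic):

```python
import re

text = "@bbbb@a@"

# The negated class cannot step over the second @, so the literal "a"
# is never reached inside the first @...@ pair and the match fails.
negated = re.match(r"^@[^@]*a[^@]*@", text)

# The lazy dot is free to expand across the second @ until it finds "a".
lazy = re.match(r"^@.*?a.*?@", text)

print(negated)        # None
print(lazy.group(0))  # @bbbb@a@
```

The lazy version succeeds only by swallowing a delimiter, which is usually not what is intended when `@` marks field boundaries.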
Note that `[^@]` also matches newlines, while `.` doesn't (in most regex engines, unless single-line mode is enabled). If that is not wanted, you can avoid it by adding the newline character to the negated class.
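A small sketch of the newline difference, assuming Python's `re` module, where single-line mode is spelled `re.DOTALL`. The sample `text` is a hypothetical input chosen to contain a line break between the delimiters:

```python
import re

text = "@line one\nline two@"  # hypothetical multi-line input

# By default "." stops at the newline, so the lazy pattern fails.
dot_default = re.match(r"^@.*?@", text)

# With DOTALL (single-line mode) "." matches the newline as well.
dot_all = re.match(r"^@.*?@", text, re.DOTALL)

# The negated class matches the newline without any flag.
negated = re.match(r"^@[^@]*@", text)

# Adding \n to the negation restores the "stay on one line" behaviour.
negated_no_newline = re.match(r"^@[^@\n]*@", text)

print(dot_default, negated_no_newline)        # None None
print(dot_all.group(0) == negated.group(0))   # True
```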