Nested regex lookahead and lookbehind
Question
I am having problems with nested positive/negative ('+'/'-') lookahead and lookbehind in regex.
Let's say that I want to replace each '*' in a string with '%', and that '\' escapes the next character (turning a regex into an SQL LIKE pattern).
So '*test*' should be changed to '%test%' and '\\*test\\*' to '\\%test\\%', but '\*test\*' and '\\\*test\\\*' should stay the same.
I tried:
(?<!\\)(?=\\\\)*\* but this doesn't work
(?<!\\)((?=\\\\)*\*) ...
(?<!\\(?=\\\\)*)\* ...
(?=(?<!\\)(?=\\\\)*)\* ...
What is the correct regex that will match the '*'s in the examples given above?
What is the difference between (?<!\\(?=\\\\)*)\* and (?=(?<!\\)(?=\\\\)*)\*? Or, if both are essentially wrong, what is the difference between regexes built with these two constructions?
Recommended answer
To find an unescaped character, look for a character that is preceded by an even number of escape characters (zero included). This is relatively straightforward.
(?<=(?<!\\)(?:\\\\)*)\* # this is explained in Tim Pietzcker's answer
Unfortunately, many regex engines do not support variable-length look-behind, so we have to substitute a look-ahead:
(?=(?<!\\)(?:\\\\)*\*)(\\*)\* # also look at ridgerunner's improved version
Replace this with the contents of group 1 and a % sign.
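As a minimal sketch of the replacement step, assuming Python's standard `re` module is the engine (the question does not say which one is used): `re` is one of the engines that rejects variable-length look-behind, so the look-ahead pattern is the one applied. The names `stars_to_percent` and `compiles` are made up for this illustration.

```python
import re

# Variable-length look-behind version from the answer.
LOOKBEHIND_PATTERN = r'(?<=(?<!\\)(?:\\\\)*)\*'
# Look-ahead version from the answer; group 1 holds the backslashes.
LOOKAHEAD_PATTERN = r'(?=(?<!\\)(?:\\\\)*\*)(\\*)\*'


def compiles(pattern):
    """Return True if the `re` module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


UNESCAPED_STAR = re.compile(LOOKAHEAD_PATTERN)


def stars_to_percent(text):
    """Replace every unescaped '*' with '%', keeping the backslashes before it."""
    # \1 re-inserts the backslashes captured by group 1.
    return UNESCAPED_STAR.sub(r'\1%', text)


if __name__ == '__main__':
    print(compiles(LOOKBEHIND_PATTERN))  # rejected by `re`
    for sample in ['*test*', r'\\*test\\*', r'\*test\*', r'\\\*test\\\*']:
        print(sample, '->', stars_to_percent(sample))
```

The four samples are the ones from the question: the first two are rewritten, the last two come back unchanged.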
Explanation
(?= # start look-ahead
(?<!\\) # a position not preceded by a backslash (via look-behind)
(?:\\\\)* # an even number of backslashes (don't capture them)
\* # a star
) # end look-ahead. If found,
( # start group 1
\\* # match any number of backslashes in front of the star
) # end group 1
\* # match the star itself
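The annotated breakdown above maps almost directly onto a verbose-mode pattern. The following is a sketch assuming Python's `re` module with the `re.VERBOSE` flag, which ignores whitespace in the pattern and treats `#` as the start of a comment; the name `UNESCAPED_STAR_VERBOSE` is invented for this example.

```python
import re

# Same pattern as the compact look-ahead version, written with comments.
UNESCAPED_STAR_VERBOSE = re.compile(r'''
    (?=             # start look-ahead
      (?<!\\)       #   a position not preceded by a backslash
      (?:\\\\)*     #   an even number of backslashes (not captured)
      \*            #   a star
    )               # end look-ahead
    (               # start group 1
      \\*           #   any number of backslashes in front of the star
    )               # end group 1
    \*              # the star itself
''', re.VERBOSE)

# Compact form, for comparison.
UNESCAPED_STAR_COMPACT = re.compile(r'(?=(?<!\\)(?:\\\\)*\*)(\\*)\*')

if __name__ == '__main__':
    sample = r'\\*test\*'
    print(UNESCAPED_STAR_VERBOSE.sub(r'\1%', sample))
    print(UNESCAPED_STAR_COMPACT.sub(r'\1%', sample))
```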
The look-ahead makes sure that only an even number of backslashes is accepted. The backslashes still have to be matched into a group, because the look-ahead does not advance the position in the string.
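To illustrate why group 1 is needed, here is a small sketch (again assuming Python's `re` module; the sample string is made up). The match starts at the first backslash, not at the star, so the backslashes are part of the matched text and would be lost if the replacement were just a % sign.

```python
import re

pattern = re.compile(r'(?=(?<!\\)(?:\\\\)*\*)(\\*)\*')

sample = r'ab\\*cd'  # a, b, two backslashes, a star, c, d

m = pattern.search(sample)
match_span = m.span()       # covers the two backslashes and the star
backslashes = m.group(1)    # the backslashes consumed in front of the star

# Replacing with '%' alone drops the backslashes ...
without_group = pattern.sub('%', sample)
# ... while putting group 1 back keeps them.
with_group = pattern.sub(r'\1%', sample)

if __name__ == '__main__':
    print(match_span, backslashes)
    print(without_group)
    print(with_group)
```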