Nested regex lookahead and lookbehind

Question

I am having problems with nested positive/negative lookahead and lookbehind in regex.

Let's say that I want to replace each '*' in a string with '%', and that '\' escapes the next character. (This is for turning a regex into a SQL LIKE command.)

So the strings

  • '*test*' should be changed to '%test%',
  • '\\*test\\*' -> '\\%test\\%', but
  • '\*test\*' and '\\\*test\\\*' should stay the same.

I have tried:

(?<!\\)(?=\\\\)*\*      but this doesn't work
(?<!\\)((?=\\\\)*\*)    ...
(?<!\\(?=\\\\)*)\*      ...
(?=(?<!\\)(?=\\\\)*)\*  ...

What is the correct regex that will match the '*'s in the examples given above?

What is the difference between (?<!\\(?=\\\\)*)\* and (?=(?<!\\)(?=\\\\)*)\*? Or, if these are essentially wrong, what is the difference between regexes that have such a visual construction?

Recommended answer

To find an unescaped character, you would look for a character that is preceded by an even number of (or zero) escape characters. This is relatively straightforward.

(?<=(?<!\\)(?:\\\\)*)\*        # this is explained in Tim Pietzcker's answer

Unfortunately, many regex engines do not support variable-length look-behind, so we have to substitute with look-ahead:
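As one concrete illustration of that limitation, Python's standard-library `re` module is an engine that rejects variable-length look-behind. The sketch below only checks whether a pattern compiles; the helper name `compiles` is made up for this example, and other engines may behave differently.

```python
import re

# The variable-length look-behind version of the pattern.
VARIABLE_LOOKBEHIND = r'(?<=(?<!\\)(?:\\\\)*)\*'

def compiles(pattern):
    """Return True if the standard-library `re` module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # `re` reports that look-behind requires a fixed-width pattern.
        return False

print(compiles(VARIABLE_LOOKBEHIND))  # False: (?:\\\\)* has no fixed width
print(compiles(r'(?<!\\)\*'))         # True: a one-character look-behind is fine
```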

(?=(?<!\\)(?:\\\\)*\*)(\\*)\*  # also look at ridgerunner's improved version

Replace each match with the contents of group 1 followed by a % sign.
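A minimal sketch of that replacement, assuming Python's `re` module as the engine; the function name `stars_to_percent` is hypothetical and chosen only for this example. The four inputs are the ones listed in the question.

```python
import re

# Look-ahead version of the pattern: group 1 captures the backslashes
# that sit directly in front of an unescaped star.
UNESCAPED_STAR = re.compile(r'(?=(?<!\\)(?:\\\\)*\*)(\\*)\*')

def stars_to_percent(text):
    """Replace every unescaped '*' with '%', keeping preceding backslashes."""
    return UNESCAPED_STAR.sub(r'\1%', text)

print(stars_to_percent('*test*'))          # %test%
print(stars_to_percent(r'\\*test\\*'))     # \\%test\\%
print(stars_to_percent(r'\*test\*'))       # \*test\*      (unchanged)
print(stars_to_percent(r'\\\*test\\\*'))   # \\\*test\\\*  (unchanged)
```

Raw strings are used for the inputs so that each backslash shown is a literal backslash in the subject string.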

Explanation

(?=           # start look-ahead
  (?<!\\)     #   a position not preceded by a backslash (via look-behind)
  (?:\\\\)*   #   an even number of backslashes (don't capture them)
  \*          #   a star
)             # end look-ahead. If found,
(             # start group 1
  \\*         #   match any number of backslashes in front of the star
)             # end group 1
\*            # match the star itself

The look-ahead makes sure that only an even number of backslashes is accepted in front of the star. Because the look-ahead does not advance the position in the string, the backslashes still have to be matched, and group 1 captures them so that they can be put back in the replacement.
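To see that the look-ahead consumes nothing, the following sketch (again assuming Python's `re` module) compares the look-ahead on its own with the full pattern on a sample string of two backslashes, a star and "test". The variable names are illustrative only.

```python
import re

LOOKAHEAD_ONLY = re.compile(r'(?=(?<!\\)(?:\\\\)*\*)')
FULL_PATTERN = re.compile(r'(?=(?<!\\)(?:\\\\)*\*)(\\*)\*')

sample = r'\\*test'   # two backslashes, a star, then "test"

peek = LOOKAHEAD_ONLY.search(sample)
full = FULL_PATTERN.search(sample)

print(peek.span())    # (0, 0): the look-ahead succeeds but matches no characters
print(full.span())    # (0, 3): group 1 and \* consume the backslashes and the star
print(full.group(1))  # \\ : the backslashes to be reinserted in the replacement
```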
