RegEx: Look-behind to avoid an odd number of consecutive backslashes


Question

I have user input in which some tags are allowed inside square brackets. I have already written the regex pattern to find and validate what is inside the brackets.

In the user input, an opening bracket ([) can be escaped with a backslash, and a backslash can itself be escaped with another backslash (\\). I need a look-behind sub-pattern that rejects an odd number of consecutive backslashes before the opening bracket.

At the moment I have to work with something like this:

(?<!\\)(?:\\\\)*\[(?<inside_brackets>.*?)]

It works, but the match still includes any pairs of consecutive backslashes in front of the bracket (even though they are not captured); the look-behind only checks whether one more single backslash precedes those pairs (or precedes the opening bracket directly). If possible, I would like to keep all of the backslashes out of the match by handling them inside the look-behind group.

my [test] string is ok
my \[test] string is wrong
my \\[test] string is ok
my \\\[test] string is wrong
my \\\\[test] string is ok
my \\\\\[test] string is wrong
...
etc

I am using PHP (PCRE).

Recommended answer

Last time I checked, PHP did not support variable-length look-behinds. That is why you cannot use the trivial solution (?<![^\\](?:\\\\)*\\).

The simplest workaround is to match the entire thing, backslashes included, and capture them in a group of their own:

(?<!\\)((?:\\\\)*)\[(?<inside_brackets>.*?)]

The difference is that if you use this regex in preg_replace, you have to remember to prefix the replacement string with $1 so that the matched backslashes are put back.
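As a minimal sketch of that advice, the snippet below runs the pattern above through preg_replace on a few of the sample lines from the question. The helper name replaceTags and the <...> output form are assumptions made up for illustration; only the pattern and the use of $1 come from the answer. The pattern and the samples are written as nowdocs so that the backslashes do not need a second layer of PHP escaping.

```php
<?php
// Pattern copied from the answer. A nowdoc keeps the backslashes literal,
// so the regex reads the same here as in the text.
$pattern = <<<'REGEX'
/(?<!\\)((?:\\\\)*)\[(?<inside_brackets>.*?)]/
REGEX;

// Hypothetical helper: rewrites [tag] as <tag>. The <...> form is only an
// illustration; substitute whatever the tags should become.
function replaceTags(string $text, string $pattern): string
{
    // $1 = the escaped backslash pairs consumed by the match,
    // $2 = the text inside the brackets (the named group is also group 2).
    // Fall back to the input if preg_replace reports an error (null).
    return preg_replace($pattern, '$1<$2>', $text) ?? $text;
}

// Sample lines from the question, again in a nowdoc to avoid re-escaping.
$samples = <<<'TXT'
my [test] string
my \[test] string
my \\[test] string
my \\\[test] string
TXT;

foreach (preg_split('/\R/', $samples) as $line) {
    echo $line, '  =>  ', replaceTags($line, $pattern), "\n";
}
// Prints:
// my [test] string  =>  my <test> string
// my \[test] string  =>  my \[test] string
// my \\[test] string  =>  my \\<test> string
// my \\\[test] string  =>  my \\\[test] string
```

Without the leading $1 in the replacement, the third line would lose its two backslashes, because they are part of the matched text.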
