Regex: Workaround for PHP's look-behind fixed-length assert limitation
Question
I'm trying to understand more about look-around assertions, and I found this thread, where the solution is supposed to work in some engines but not in PHP's, because PHP requires look-behind assertions to be fixed-length.
What I want is to make the same scenario work in PHP, or at least to know whether it is possible at all.
I tried to simplify the explanation of the regex rules, so it is not the same as in the thread mentioned above, but it follows the same principle.
I need to match a string built from three parts:
- Starts with any number of alphanumeric characters
- Does not contain "abc-" followed by 3 to 5 digits and/or hyphens
- Ends with ".htm" or ".html"
So these would match:
- xxxyz-123.html
- xx123-abc.htm
- xxabc123.html
- xxabc-123-45.htm
But these would not match:
- xxabc-4324.htm
- xxabc-1-2.html
- xxac-12-34.txt
- xxabc-12345.htm
I've been trying some variations of the regex pattern below, but it's not working. This particular one fails because of the fixed-length limitation:
.*(?<!abc-[\d-]{3,5})\.htm[^l]?$
I also tried different test strings and dropped the 3-to-5 range, focusing on exactly, say, 3 digits and/or hyphens. I used the regex below, and it still doesn't work, which is why I decided to ask for help:
.*(?<!abc-[\d-]{3})\.htm[^l]?$
Could any of you regex gurus help me out here?
Edit
This is my testing PHP code:
// $matching2 is assumed to hold the test strings listed above.
$regex = "/^(?!.*abc-[\d-]{3,5})[a-zA-Z0-9-]+\.html?$/";
foreach ( $matching2 as $k => $v ) {
    $matches = preg_match( $regex, $v );
    echo '"', $v, '"', ( $matches != 0 ) ? ' matches' : ' doesn\'t match', '<br />';
}
Recommended answer
Why do you need to look at this in reverse? Why not just use a lookahead?
^(?!.*abc-[\d-]{3,5}[^\d-])[a-zA-Z0-9-]+\.html?$
This simply starts at the beginning of the string, and the lookahead tries to find the disallowed string anywhere (.*) in the string. If it finds it, the lookahead makes the pattern fail. The pattern also enforces the requirement that the string consist only of alphanumerics and hyphens.
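A minimal sketch for checking the lookahead pattern against the sample strings from the question. The constant name `RULE`, the helper `matchesRule`, and the array layout are assumptions made for illustration; only the pattern itself comes from the answer.

```php
<?php
// Pattern from the answer, single-quoted so the backslashes reach PCRE untouched.
const RULE = '/^(?!.*abc-[\d-]{3,5}[^\d-])[a-zA-Z0-9-]+\.html?$/';

// Hypothetical helper: true when $name satisfies the rule.
function matchesRule(string $name): bool
{
    return preg_match(RULE, $name) === 1;
}

// Sample strings from the question, mapped to the result the question expects.
$samples = [
    'xxxyz-123.html'   => true,
    'xx123-abc.htm'    => true,
    'xxabc123.html'    => true,
    'xxabc-123-45.htm' => true,
    'xxabc-4324.htm'   => false,
    'xxabc-1-2.html'   => false,
    'xxac-12-34.txt'   => false,
    'xxabc-12345.htm'  => false,
];

foreach ($samples as $name => $expected) {
    $actual = matchesRule($name);
    printf(
        "%-18s %-14s %s\n",
        $name,
        $actual ? 'matches' : "doesn't match",
        $actual === $expected ? 'ok' : 'UNEXPECTED'
    );
}
```

The trailing `[^\d-]` is what lets `xxabc-123-45.htm` through: after `abc-` there are six digits/hyphens in a row, so no run of 3 to 5 of them is immediately followed by a character outside that class.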
This is, by the way, the same solution that is used for the question you linked. Perl has traditionally not been able to cope with variable-length lookbehinds either; .NET is the notable engine that fully supports them.
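If a lookbehind is still preferred, one option that should work in PHP is to expand the `{3,5}` range into separate alternatives: PCRE accepts a lookbehind whose top-level branches each have a fixed length, even when those lengths differ. The sketch below assumes that behaviour; the names `LOOKBEHIND_RULE` and `matchesWithLookbehind` are hypothetical.

```php
<?php
// Each top-level branch of the lookbehind has its own fixed length (7, 8, 9),
// which PCRE permits; a single branch with {3,5} would not compile.
const LOOKBEHIND_RULE =
    '/^[a-zA-Z0-9-]+(?<!abc-[\d-]{3}|abc-[\d-]{4}|abc-[\d-]{5})\.html?$/';

// Hypothetical helper: true when $name satisfies the rule.
function matchesWithLookbehind(string $name): bool
{
    return preg_match(LOOKBEHIND_RULE, $name) === 1;
}

var_dump(matchesWithLookbehind('xxabc-123-45.htm')); // bool(true)
var_dump(matchesWithLookbehind('xxabc-4324.htm'));   // bool(false)
```

Note that this variant only rejects the disallowed sequence when it sits directly before the extension, whereas the lookahead version rejects it anywhere in the string. For the sample strings in the question the two agree.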
Another note: if you ever encounter a case where you actually do need a variable-length lookbehind (but not a variable-length lookahead), reverse the string (and the pattern too, of course). ;)
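The reversal trick can be sketched as follows, applied to the rule from the question. The reversed pattern and the names `REVERSED_RULE` and `matchesReversed` are illustrative assumptions, not part of the original answer.

```php
<?php
// Reversed form of the intended rule: the extension now comes first, and the
// lookbehind (?<!abc-[\d-]{3,5}) turns into the lookahead (?![\d-]{3,5}-cba).
const REVERSED_RULE = '/^l?mth\.(?![\d-]{3,5}-cba)[a-zA-Z0-9-]+$/';

// Hypothetical helper. strrev() reverses byte by byte, which is fine for the
// ASCII file names used here but would break multibyte text.
function matchesReversed(string $name): bool
{
    return preg_match(REVERSED_RULE, strrev($name)) === 1;
}

var_dump(matchesReversed('xxabc-123-45.htm')); // bool(true)
var_dump(matchesReversed('xxabc-1-2.html'));   // bool(false)
```

Because lookaheads may have variable length in PCRE, the `{3,5}` quantifier can stay as it is once the pattern is reversed.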