Issue with a look-behind regular expression (Ruby)
Question
I wrote this regex to match all href and src links in an HTML page (I know I should be using a parser; this is just an experiment):
/((href|src)\=\").*?\"/
# without look-behind
It works fine, but when I try to rewrite the first part of the expression as a look-behind pattern:
/(?<=(href|src)\=\").*?\"/
# with look-behind
It throws an error stating 'invalid look-behind pattern'. Any ideas what's going wrong with the look-behind?
Recommended answer
Lookbehind has restrictions:
(?<=subexp) look-behind
(?<!subexp) negative look-behind
Subexp of look-behind must be fixed character length.
But different character length is allowed in top level
alternatives only.
ex. (?<=a|bc) is OK. (?<=aaa(?:b|cd)) is not allowed.
In negative-look-behind, captured group isn't allowed,
but shy group(?:) is allowed.
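The two examples from the excerpt above can be checked directly. This is a minimal sketch: `compiles?` is a hypothetical helper introduced here for illustration, and it assumes the patterns are built with `Regexp.new`, which raises `RegexpError` for a pattern the engine rejects. On a Ruby whose regex engine enforces the quoted rule, the nested case is expected to report `false`; the exact behavior may depend on the Ruby version.

```ruby
# Hypothetical helper: true if Ruby's regex engine accepts the pattern source.
def compiles?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

# Both pattern sources are taken from the documentation excerpt above.
TOP_LEVEL = '(?<=a|bc)'         # alternatives of different length at the top level
NESTED    = '(?<=aaa(?:b|cd))'  # alternatives of different length inside a group

puts "#{TOP_LEVEL}: #{compiles?(TOP_LEVEL)}"
puts "#{NESTED}: #{compiles?(NESTED)}"
```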
You cannot put alternatives of different lengths anywhere but the top level of a (negative) lookbehind.
Put them at the top level. You also don't need to escape some of the characters that you escaped (= and ").
/(?<=href="|src=").*?"/
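As a rough usage sketch, the corrected pattern can be run with `String#scan`. The sample HTML string is made up for illustration. Note that the answer's pattern still includes the closing quote in each match; the second pattern is an assumed variant, not part of the answer, that uses a look-ahead to leave the quote out.

```ruby
# Made-up sample input for illustration.
html = '<a href="/index.html">Home</a> <img src="logo.png">'

# Pattern from the answer: the alternatives sit at the top level of the look-behind.
with_quote = html.scan(/(?<=href="|src=").*?"/)

# Assumed variant (not from the answer): a look-ahead keeps the closing quote
# out of the match, and [^"]* stops at the first quote.
links = html.scan(/(?<=href="|src=")[^"]*(?=")/)

p with_quote  # each match still ends with the closing quote
p links       # only the attribute values
```

For anything beyond experimenting, an HTML parser remains the safer choice, as the question itself notes.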