Regex multiline mode with optional group skips valid data

Problem description

Consider the following example:

$payload = '
ababaaabbb =%=
ababaaabbb =%=
ababaa     =%=
';

$pattern = '/^[ab]+\s*(?:=%=)?$/m';
preg_match_all($pattern, $payload, $matches);
var_dump($matches);

The expected and actual match results are:

"ababaaabbb =%="
"ababaaabbb =%="
"ababaa     =%="

But if $payload is changed to

$payload = '
ababaaabbb =%=
ababaaabbb =%=
ababaa     =%'; // "=" sign removed at EOL

the actual result is

"ababaaabbb =%="
"ababaaabbb =%="

but the expected result is

"ababaaabbb =%="
"ababaaabbb =%="
"ababaa     "

Why does this happen? The group (?:=%=)? is optional because of the trailing ?, so the last line of the payload should also appear in the match results.

Recommended answer

Have a look at your current regex graph:

The =%= is optional (see how the branch between white space and End of line forks), but the end of line (EOL) is required. That means that after one or more a or b symbols and zero or more whitespace characters, EOL must occur. However, your third line has =% at that position, so there is no match.
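To see the failing line in isolation, here is a minimal sketch that applies the pattern from the question to each line of the second payload separately. The $lines and $results variables are introduced only for this illustration and are not part of the original question.

```php
<?php
// The pattern from the question, unchanged.
$pattern = '/^[ab]+\s*(?:=%=)?$/m';

// The three lines of the second payload, checked one at a time.
$lines = [
    'ababaaabbb =%=',
    'ababaaabbb =%=',
    'ababaa     =%',   // trailing "=" removed
];

$results = [];
foreach ($lines as $line) {
    // preg_match() returns 1 when the pattern matches and 0 when it does not.
    $results[] = preg_match($pattern, $line);
}

var_dump($results);
```

The first two lines match. On the third, the optional group cannot consume "=%", and the mandatory $ cannot match in front of it, so the whole attempt fails.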

Now, when you move the $ anchor into the optional group:

The end of line is now optional, too, and the match is returned after matching one or more a or b characters and any optional whitespace.
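The rewritten pattern itself is not shown in the text, so the sketch below assumes the most direct reading of "move the $ anchor into the optional group", namely (?:=%=$)?. The variable names $original, $modified, $before and $after are illustrative only.

```php
<?php
// Second payload from the question: the last line ends in "=%".
$payload = '
ababaaabbb =%=
ababaaabbb =%=
ababaa     =%';

// Original pattern: "$" sits outside the optional group, so it is mandatory.
$original = '/^[ab]+\s*(?:=%=)?$/m';
// Assumed rewrite: "$" moved inside the optional group, so it is optional too.
$modified = '/^[ab]+\s*(?:=%=$)?/m';

preg_match_all($original, $payload, $before);
preg_match_all($modified, $payload, $after);

var_dump($before[0]); // the two complete lines only
var_dump($after[0]);  // the two complete lines plus "ababaa" and its trailing spaces
```

Note that with this rewrite the third match keeps the whitespace consumed by \s*, which is the "ababaa     " result the question expected.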
