Difference in matching end of line with PHP regex

Views: 223 Published: 2020/5/27 18:36:07 Tags: php regex php-7 php-7.3


Question

Given the code:

$my_str = '
Rollo is*
My dog*
And he\'s very*
Lovely*
';

preg_match_all('/\S+(?=\*$)/m', $my_str, $end_words);
print_r($end_words);

In PHP 7.3.2 (XAMPP) I get the unexpected output:

Array ( [0] => Array ( ) )

Whereas in PHPFiddle, on PHP 7.0.33, I get what I expected:

Array ( [0] => Array ( [0] => is [1] => dog [2] => very [3] => Lovely ) )

Can anyone tell me why I'm getting this difference, and whether something changed in regex behaviour after 7.0.33?

Recommended answer

It seems that in your environment the PCRE library was compiled without the PCRE_NEWLINE_ANY option, so in multiline mode $ only matches before an LF character, and . matches any character except LF. If the string uses CRLF line endings (as is common for files saved on Windows), a CR sits between the * and the LF, so the lookahead \*$ cannot match.
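A minimal sketch of the failure, assuming the difference comes from CRLF line endings in the source string. The (*LF) verb is added here only to pin the newline convention so the result does not depend on how PCRE was built. The variable names are illustrative.

```php
<?php
// The lines from the question, once with LF and once with CRLF line endings.
$lf   = "Rollo is*\nMy dog*\nAnd he's very*\nLovely*\n";
$crlf = str_replace("\n", "\r\n", $lf);

// (*LF) forces "only LF is a line break", mimicking the failing environment.
$pattern = '/(*LF)\S+(?=\*$)/m';

preg_match_all($pattern, $lf, $lf_words);
preg_match_all($pattern, $crlf, $crlf_words);

print_r($lf_words[0]);   // is, dog, very, Lovely
print_r($crlf_words[0]); // empty: a CR sits between * and the LF
```

With LF-only input the pattern behaves as expected. With CRLF input the position after * is followed by CR, which is not a line break under this convention.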

You can fix it by using the PCRE (*ANYCRLF) verb:

'~(*ANYCRLF)\S+(?=\*$)~m'
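As a sketch of how this pattern might be used, here is the question's text with explicit CRLF line endings (an assumption about what the XAMPP environment was seeing):

```php
<?php
// The question's lines, written with explicit CRLF line endings.
$my_str = "Rollo is*\r\nMy dog*\r\nAnd he's very*\r\nLovely*\r\n";

// (*ANYCRLF) must be the very first thing in the pattern.
preg_match_all('~(*ANYCRLF)\S+(?=\*$)~m', $my_str, $end_words);

print_r($end_words[0]); // is, dog, very, Lovely
```

The same pattern also works on LF-only input, so it can be used regardless of how the file was saved.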

(*ANYCRLF) sets the newline convention so that CR, LF, or CRLF is each recognized as a line break. It corresponds to the PCRE_NEWLINE_ANYCRLF option; the broader PCRE_NEWLINE_ANY option corresponds to the (*ANY) verb. See the PCRE documentation:

PCRE_NEWLINE_ANY specifies that any Unicode newline sequence should be recognized.

In the end, with this PCRE verb . matches any character except CR and LF, and in multiline mode $ matches right before either of these two characters.

See more about this and other verbs at rexegg.com:

By default, when PCRE is compiled, you tell it what to consider to be a line break when encountering a . (as the dot doesn't match line breaks unless in dotall mode), as well as for the ^ and $ anchors' behavior in multiline mode. You can override this default with the following modifiers:

✽ (*CR) Only a carriage return is considered to be a line break
✽ (*LF) Only a line feed is considered to be a line break (as on Unix)
✽ (*CRLF) Only a carriage return followed by a line feed is considered to be a line break (as on Windows)
✽ (*ANYCRLF) Any of the above three is considered to be a line break
✽ (*ANY) Any Unicode newline sequence is considered to be a line break
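To compare the conventions side by side, the sketch below runs one pattern under each verb. The subject string is hypothetical and mixes a bare CR, a bare LF, and a CRLF pair; the last word ends the subject, where $ always matches.

```php
<?php
// Hypothetical subject: "one" ends with CR, "two" with LF, "three" with CRLF.
$subject = "one*\rtwo*\nthree*\r\nfour*";

$results = [];
foreach (['CR', 'LF', 'CRLF', 'ANYCRLF', 'ANY'] as $verb) {
    // Build e.g. /(*CR)\w+(?=\*$)/m and collect the words before "*<line end>".
    $pattern = '/(*' . $verb . ')\w+(?=\*$)/m';
    preg_match_all($pattern, $subject, $m);
    $results[$verb] = $m[0];
}

print_r($results);
// CR      => one, three, four
// LF      => two, four
// CRLF    => three, four
// ANYCRLF => one, two, three, four
// ANY     => one, two, three, four
```

Note that under (*CR) the word before a CRLF pair still matches, because the CR on its own counts as the line break.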

For instance, (*CR)\w+.\w+ matches Line1\nLine2 because the dot is able to match the \n, which is not considered to be a line break.
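That example can be checked with a short sketch. The (*LF) variant is added here for contrast and is not part of the quoted text:

```php
<?php
$subject = "Line1\nLine2";

// Under (*CR), \n is an ordinary character, so the dot can consume it.
preg_match('/(*CR)\w+.\w+/', $subject, $with_cr);

// Under (*LF), the dot cannot cross \n, so the match stays inside "Line1".
preg_match('/(*LF)\w+.\w+/', $subject, $with_lf);

var_dump($with_cr[0]); // "Line1\nLine2" (11 characters)
var_dump($with_lf[0]); // "Line1"
```

The (*LF) pattern still matches, but only after backtracking so that the dot consumes a letter inside the first line.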
