Regular Expression Using the Dot-Matches-All Mode

Problem Description

Normally the . doesn't match a newline unless I tell the engine to do so with the (?s) flag. I tried this regexp in my editor's (UltraEdit v14.10) regexp engine, using Perl style regexp mode:

(?s).*i

The search text contains multiple lines and each line contains many 'i' characters.

I expect the above regexp to mean: match as many characters as possible (because * is greedy, and with (?s) the . now matches anything, including newlines) until reaching the character 'i'.

This should mean "from the first character to the last 'i' in the last sentence" (greediness should reach the last sentence, right?).

But in UltraEdit's test, the match turns out to be "from the first character to the last 'i' in the first sentence that contains an i". Is this result correct? Did I misinterpret my regular expression?

For example, given this text:

aaa
bbb
aiaiaiaiaa  
bbbicicid

UltraEdit matches:

aaa
bbb
aiaiaiai

But I expect:

aaa
bbb
aiaiaiaiaa  
bbbicici

Recommended Answer

Your regex is correct, and so are your expectations of its performance.
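
As a quick check outside UltraEdit, here is a minimal sketch using Python's re module. Python is an assumption here (the question is about UltraEdit's Perl style mode), but (?s) has the same meaning in both. The sample text is the one from the question, with trailing spaces dropped.

```python
import re

# Sample text from the question (trailing spaces dropped for clarity)
text = "aaa\nbbb\naiaiaiaiaa\nbbbicicid\n"

# With (?s), . also matches newlines, so the greedy .* runs to the end of
# the text and backtracks to the last 'i' anywhere in it.
dotall_match = re.search(r"(?s).*i", text).group(0)

# Without (?s), . stops at newlines, so the match is confined to the first
# line that contains an 'i'.
default_match = re.search(r".*i", text).group(0)

print(repr(dotall_match))
print(repr(default_match))
```

In this sketch the (?s) version spans all four lines up to the final 'i', which is the result the question expects.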

This is a long-known bug in UltraEdit's regex implementation, one I have written to IDM support about repeatedly. As far as I know, it still hasn't been fixed. The problem appears to be that UE's regex implementation is essentially line-based, and additional lines are taken into the match only if necessary. So .* will match greedily on the current line, but it will not cross a newline boundary unless it has to in order to achieve a match.
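
The behavior described above can be imitated in a conforming engine, which may help to see why UltraEdit returns the shorter match. The pattern below is only a hypothetical model of "lines are pulled in lazily, matching within a line is greedy"; it is not UltraEdit's actual implementation. It is written in Python as an assumption.

```python
import re

# Sample text from the question (trailing spaces dropped for clarity)
text = "aaa\nbbb\naiaiaiaiaa\nbbbicicid\n"

# Hypothetical model of the line-based behavior:
#   (?:[^\n]*\n)*?  take in whole lines lazily (as few as possible)
#   [^\n]*i         then match greedily up to the last 'i' on one line
line_based = re.compile(r"(?:[^\n]*\n)*?[^\n]*i")

observed = line_based.search(text).group(0)
print(repr(observed))
```

Under this model the match ends at the last 'i' of the first line that contains one, which is the result the question reports from UltraEdit.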

There are some other subtle bugs with line endings. For example, lookbehind doesn't work across newlines, either.
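
For comparison, a lookbehind that spans a line break works in an engine without this limitation. The pattern and text below are made up for illustration (the answer gives no specific failing case), and Python is again an assumption.

```python
import re

text = "aaa\nbbb"

# Lookbehind requiring an 'a' followed by a newline immediately before 'b';
# it can only succeed if the engine looks back across the line boundary.
m = re.search(r"(?<=a\n)b", text)

print(m.start() if m else None)
```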

Write to IDM support, or change to an editor with decent regex support. I did both.
