Unexpected match of regex
Problem description
I expect the regex pattern ab{,2}c to match only an a followed by 0, 1 or 2 b characters, followed by a c.
It works that way in lots of languages, for instance Python. However, in R:
grepl("ab{,2}c", c("ac", "abc", "abbc", "abbbc", "abbbbc"))
# [1] TRUE TRUE TRUE TRUE FALSE
I'm surprised by the fourth TRUE. In ?regex, I can read:
{n,m}
The preceding item is matched at least n times, but not more than m times.
So I agree that {,2} should be written {0,2} to be a valid pattern (unlike in Python, where the docs state explicitly that omitting n specifies a lower bound of zero).
But then using {,2} should throw an error instead of returning misleading matches! Am I missing something, or should this be reported as a bug?
Recommended answer
The behavior with {,2} is not expected; it is a bug. If you look at the tre_parse_bound function in the TRE source code, you will see that the min variable is set to -1 before the engine tries to initialize the minimum bound. When the minimum value is missing from the quantifier, the number of repeats seems to be the maximum value plus one (as if the repeat count were max - min = max - (-1) = max + 1).
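One way to probe this "max + 1" reading is to compare the short form against an explicit range whose upper bound is raised by one. The following is only a sketch: the variable names are illustrative, and what ab{,2}c returns depends on the TRE version bundled with your R build, so no output is asserted for it. On the question's output, it coincides with ab{0,3}c for this input.

```r
x <- c("ac", "abc", "abbc", "abbbc", "abbbbc")

# Explicit range: a, then zero to three b's, then c.
explicit <- grepl("ab{0,3}c", x)

# Short form with the lower bound omitted. Guarded with tryCatch in case
# a given build rejects the pattern instead of silently accepting it.
short <- tryCatch(grepl("ab{,2}c", x), error = function(e) NA)

print(explicit)
print(short)
# TRUE here means the short form behaved like {0,3} on this input.
print(identical(explicit, short))
```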
So a{,} matches exactly one occurrence of a, and the same goes for a{, } and a{ , }. See the R demo: only abc is matched by ab{,}c:
grepl("ab{,}c", c("ac", "abc", "abbc", "abbbc", "abbbbc"))
grepl("ab{, }c", c("ac", "abc", "abbc", "abbbc", "abbbbc"))
grepl("ab{ , }c", c("ac", "abc", "abbc", "abbbc", "abbbbc"))
## => [1] FALSE TRUE FALSE FALSE FALSE
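Until the parser is fixed, the practical workaround is the one already suggested in the question: always write the lower bound explicitly. A minimal sketch with the same test vector (the variable names are illustrative):

```r
x <- c("ac", "abc", "abbc", "abbbc", "abbbbc")

# Explicit lower bound: a, then zero to two b's, then c.
res <- grepl("ab{0,2}c", x)
print(res)
# [1]  TRUE  TRUE  TRUE FALSE FALSE
```

The pattern is unanchored, but none of the longer strings contains an a followed by at most two b characters and then a c, so only the first three elements match.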