Match list of incrementing integers using regex
Question
Is it possible to match a list of comma-separated decimal integers, where each integer in the list is exactly one greater than the one before it?
These should match:
0,1,2,3
8,9,10,11
1999,2000,2001
99,100,101
These should not match (in their entirety; the last two contain matching subsequences):
42
3,2,1
1,2,4
10,11,13
Answer
Yes, this is possible when using a regex engine that supports backreferences and conditionals.
First, a list of consecutive numbers can be decomposed into a list in which every pair of adjacent numbers is consecutive:
(?=(?&cons))\d+
(?:,(?=(?&cons))\d+)*
,\d+
Here (?=(?&cons)) is a placeholder for a predicate that ensures that two numbers are consecutive. This predicate might look as follows:
(?<cons>\b(?:
(?<x>\d*)
(?:(?<a0>0)|(?<a1>1)|(?<a2>2)|(?<a3>3)|(?<a4>4)
|(?<a5>5)|(?<a6>6)|(?<a7>7)|(?<a8>8))
(?:9(?= 9*,\g{x}\d (?<y>\g{y}?+ 0)))*
,\g{x}
(?(a0)1)(?(a1)2)(?(a2)3)(?(a3)4)(?(a4)5)
(?(a5)6)(?(a6)7)(?(a7)8)(?(a8)9)
(?(y)\g{y})
# handle the 999 => 1000 case separately
| (?:9(?= 9*,1 (?<z>\g{z}?+ 0)))+
,1\g{z}
)\b)
For a brief explanation, the second case, which handles pairs like 999,1000, is easier to understand: there is a very detailed description of how it works in an answer concerned with matching a^n b^n. The connection between the two is that here we need to match 9^n ,1 0^n.
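As a rough illustration of the language this branch recognizes (not of the regex technique itself), the 9^n ,1 0^n shape can be checked in Python by capturing both runs and comparing their lengths outside the pattern. The name is_all_nines_rollover is made up for this sketch.

```python
import re

# Hypothetical helper: recognizes pairs of the form 9^n ,1 0^n with n >= 1,
# e.g. "9,10" or "999,1000". The pattern only finds the two runs; the
# length comparison that makes it an a^n b^n check is done in Python.
_ROLLOVER = re.compile(r"(9+),1(0+)")

def is_all_nines_rollover(pair: str) -> bool:
    m = _ROLLOVER.fullmatch(pair)
    return m is not None and len(m.group(1)) == len(m.group(2))
```

The regex in the answer performs the same length comparison inside the pattern, which is why it needs the lookahead and the self-referencing group.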
The first case is slightly more complicated. Most of it handles the simple case of incrementing a single decimal digit, which is relatively verbose because every digit has to be listed:
(?:(?<a0>0)|(?<a1>1)|(?<a2>2)|(?<a3>3)|(?<a4>4)
|(?<a5>5)|(?<a6>6)|(?<a7>7)|(?<a8>8))
(?(a0)1)(?(a1)2)(?(a2)3)(?(a3)4)(?(a4)5)
(?(a5)6)(?(a6)7)(?(a7)8)(?(a8)9)
The first block records which digit N was seen by capturing it into group aN, and the second block then uses conditionals to check which of these groups was used. If group aN took part in the match, the next digit must be N+1.
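The capture-then-check idea can be tried in isolation with Python's built-in re module, which also supports conditionals of the form (?(name)...) but spells named groups as (?P<name>...). This sketch covers only a single-digit pair such as 3,4; the names DIGIT_STEP and is_digit_step are assumptions made for the example.

```python
import re

# Capture digit N (0..8) into group aN ...
capture = "|".join(f"(?P<a{n}>{n})" for n in range(9))
# ... then, after the comma, require N+1 for whichever group was set.
# A conditional whose group did not participate matches the empty string.
check = "".join(f"(?(a{n}){n + 1})" for n in range(9))

DIGIT_STEP = re.compile(f"(?:{capture}),{check}")

def is_digit_step(pair: str) -> bool:
    return DIGIT_STEP.fullmatch(pair) is not None
```

For example, is_digit_step("3,4") succeeds because only group a3 is set, so only the conditional (?(a3)4) consumes input.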
The remainder of the first case handles pairs like 1999,2000. These again follow the pattern N 9^n, N+1 0^n, so this is a combination of the method for matching a^n b^n and the method for incrementing a decimal digit. The simple case of 1,2 is handled as the limiting case where n=0.
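To make the shape prefix N 9^n , prefix N+1 0^n concrete, here is a minimal Python sketch that splits both numbers with an ordinary regex and then compares the pieces in code. It illustrates the grammar rather than the regex-only technique, and the names _CARRY and is_carry_step are hypothetical.

```python
import re

# Hypothetical helper for the shape  prefix N 9^n , prefix (N+1) 0^n  (n >= 0).
# N is the last non-9 digit of the first number and N+1 is the last non-zero
# digit of the second, so each number can be split in only one way.
_CARRY = re.compile(r"(\d*)([0-8])(9*),(\d*)([1-9])(0*)")

def is_carry_step(pair: str) -> bool:
    m = _CARRY.fullmatch(pair)
    if m is None:
        return False
    prefix_a, digit_a, nines, prefix_b, digit_b, zeros = m.groups()
    return (
        prefix_a == prefix_b                    # same leading digits
        and int(digit_b) == int(digit_a) + 1    # N becomes N+1
        and len(nines) == len(zeros)            # 9^n becomes 0^n
    )
```

With n=0 this accepts plain pairs such as 1,2 and 10,11. It rejects 99,100 because the first number has no non-9 digit, which is the case the second branch of the predicate covers.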
Complete regex: https://regex101.com/r/zG4zV0/1
Alternatively, the (?&cons) predicate can be implemented slightly more directly if recursive subpattern references are supported:
(?<cons>\b(?:
(?<x>\d*)
(?:(?<a0>0)|(?<a1>1)|(?<a2>2)|(?<a3>3)|(?<a4>4)
|(?<a5>5)|(?<a6>6)|(?<a7>7)|(?<a8>8))
(?<y>
,\g{x}
(?(a0)1)(?(a1)2)(?(a2)3)(?(a3)4)(?(a4)5)
(?(a5)6)(?(a6)7)(?(a7)8)(?(a8)9)
| 9 (?&y) 0
)
# handle the 999 => 1000 case separately
| (?<z> 9,10 | 9(?&z)0 )
)\b)
In this case the two grammars, 9^n ,1 0^n with n>=1 and prefix N 9^n , prefix N+1 0^n with n>=0, are pretty much just written out explicitly.
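The recursive group (?<z> 9,10 | 9(?&z)0 ) reads almost like a recursive function. As a sketch of that correspondence, assuming a made-up function name matches_z, the same grammar in plain Python looks like this:

```python
# Hypothetical mirror of the recursive group  (?<z> 9,10 | 9(?&z)0 ):
# either the base case "9,10", or a 9 and a 0 wrapped around a smaller match.
def matches_z(s: str) -> bool:
    if s == "9,10":
        return True
    return (
        len(s) >= 2
        and s[0] == "9"
        and s[-1] == "0"
        and matches_z(s[1:-1])
    )
```

Each level of recursion strips one 9 from the front and one 0 from the back, which is what keeps the two runs the same length.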
Complete alternative regex: https://regex101.com/r/zG4zV0/3
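When experimenting with either regex, it can help to have an independent check to compare against. The following is a minimal sketch of such a check, based on plain integer arithmetic rather than on anything from the answer. It uses the same pairwise decomposition (every adjacent pair must be consecutive), and the name is_incrementing_list is an assumption.

```python
# Hypothetical reference check built on plain integer arithmetic, useful for
# cross-checking a regex against the examples from the question.
def is_incrementing_list(text: str) -> bool:
    parts = text.split(",")
    # A single number is not a list; reject empty or non-decimal items too.
    if len(parts) < 2 or not all(p.isascii() and p.isdigit() for p in parts):
        return False
    numbers = [int(p) for p in parts]
    # The list is valid exactly when every adjacent pair is consecutive.
    return all(b == a + 1 for a, b in zip(numbers, numbers[1:]))
```

Because it converts to integers, this check ignores leading zeros, so it may disagree with the regexes on inputs such as 1,02.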