Regular expression with variable number of groups?
Problem description
Is it possible to create a regular expression with a variable number of groups?
After running this, for instance...
Pattern p = Pattern.compile("ab([cd])*ef");
Matcher m = p.matcher("abcddcef");
m.matches();
... I would like to have something like
m.group(1) = "c"
m.group(2) = "d"
m.group(3) = "d"
m.group(4) = "c"
(Background: I'm parsing some lines of data, and one of the "fields" is repeating. I would like to avoid a matcher.find loop for these fields.)
As pointed out by @Tim Pietzcker in the comments, Perl 6 and .NET have this feature.
Recommended answer
According to the documentation, Java regular expressions can't do this:
The captured input associated with a group is always the subsequence that the group most recently matched. If a group is evaluated a second time because of quantification then its previously-captured value, if any, will be retained if the second evaluation fails. Matching the string "aba" against the expression (a(b)?)+, for example, leaves group two set to "b". All captured input is discarded at the beginning of each match.
(emphasis added)
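To make the quoted behavior concrete, here is a minimal sketch. It first shows that a quantified group keeps only its most recent capture, then shows one possible workaround: capture the whole repeated run as a single group and split it with a second matcher. The class name RepeatedGroupDemo and the helpers lastCapture and repeatedMatches are hypothetical names chosen for illustration, and the workaround still uses a find loop, so it only confines the loop to the repeated field rather than removing it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedGroupDemo {

    // Hypothetical helper: returns what group 1 holds after a full match
    // of the pattern from the question, or null if the input does not match.
    static String lastCapture(String input) {
        Matcher m = Pattern.compile("ab([cd])*ef").matcher(input);
        // The pattern has exactly one capturing group no matter how often
        // the group repeats, so only the last repetition is kept.
        return m.matches() ? m.group(1) : null;
    }

    // Hypothetical helper: collects every repetition of [cd] between
    // "ab" and "ef" using a two-pass approach.
    static List<String> repeatedMatches(String input) {
        List<String> parts = new ArrayList<>();
        // Pass 1: capture the entire repeated run as one group.
        Matcher outer = Pattern.compile("ab([cd]*)ef").matcher(input);
        if (!outer.matches()) {
            return parts;
        }
        // Pass 2: walk the captured run one element at a time.
        Matcher inner = Pattern.compile("[cd]").matcher(outer.group(1));
        while (inner.find()) {
            parts.add(inner.group());
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(lastCapture("abcddcef"));     // prints c
        System.out.println(repeatedMatches("abcddcef")); // prints [c, d, d, c]

        // The example from the documentation: group two keeps "b".
        Matcher m = Pattern.compile("(a(b)?)+").matcher("aba");
        System.out.println(m.matches() + " " + m.group(2)); // prints true b
    }
}
```

The number of groups is fixed when the pattern is compiled (Matcher.groupCount() reports it), which is why the individual repetitions have to be recovered in a separate step.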