Regular expression with variable number of groups?

Question

Is it possible to create a regular expression with a variable number of groups?

For example, after running...

Pattern p = Pattern.compile("ab([cd])*ef");
Matcher m = p.matcher("abcddcef");
m.matches();

...I would like to have something like


  • m.group(1) = "c"
  • m.group(2) = "d"
  • m.group(3) = "d"
  • m.group(4) = "c".

(Background: I'm parsing some lines of data, and one of the "fields" is repeating. I would like to avoid a matcher.find loop for these fields.)

As pointed out by @Tim Pietzcker in the comments, perl6 and .NET have this feature.

Recommended answer

According to the documentation, Java regular expressions can't do this:



The captured input associated with a group is always the subsequence that the group most recently matched. If a group is evaluated a second time because of quantification then its previously-captured value, if any, will be retained if the second evaluation fails. Matching the string "aba" against the expression (a(b)?)+, for example, leaves group two set to "b". All captured input is discarded at the beginning of each match.

(emphasis added)
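The behavior quoted above can be checked with a short sketch. It first shows that `group(1)` holds only the last iteration of the repeated group. It then shows one possible workaround, which is not part of the original answer: capture the whole repeated run as a single group and split it with a second matcher. The class name `RepeatedGroups` and the helper names `lastCapture` and `repeatedCaptures` are hypothetical, chosen for illustration only. This still uses a `find` loop, but the loop is confined to the repeating field rather than the whole line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedGroups {
    // Hypothetical helper patterns: group 1 of LINE captures the whole
    // repeated run, and ITEM matches one element inside that run.
    private static final Pattern LINE = Pattern.compile("ab((?:[cd])*)ef");
    private static final Pattern ITEM = Pattern.compile("[cd]");

    // The pattern from the question: only the most recent iteration
    // of the quantified group is kept.
    static String lastCapture(String input) {
        Matcher m = Pattern.compile("ab([cd])*ef").matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    // Workaround sketch: match the line once, then iterate over the
    // captured run to collect each repetition.
    static List<String> repeatedCaptures(String input) {
        List<String> items = new ArrayList<>();
        Matcher line = LINE.matcher(input);
        if (line.matches()) {
            Matcher item = ITEM.matcher(line.group(1));
            while (item.find()) {
                items.add(item.group());
            }
        }
        return items;
    }

    public static void main(String[] args) {
        System.out.println(lastCapture("abcddcef"));      // prints c
        System.out.println(repeatedCaptures("abcddcef")); // prints [c, d, d, c]
    }
}
```

The number of groups in a Java pattern is fixed when the pattern is compiled, so `m.groupCount()` for `ab([cd])*ef` is 1 however many times the group repeats.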
