Regex: Repeated capturing groups

Question

I have to parse some tables from an ASCII text file. Here's a partial sample:

QSMDRYCELL   11.00   11.10   11.00   11.00    -.90      11     11000     1.212
RECKITTBEN  192.50  209.00  192.50  201.80    5.21      34      2850     5.707
RUPALIINS   150.00  159.00  150.00  156.25    6.29       4        80      .125
SALAMCRST   164.00  164.75  163.00  163.25    -.45      80      8250    13.505
SINGERBD    779.75  779.75  770.00  773.00    -.89       8        95      .735
SONARBAINS   68.00   69.00   67.50   68.00     .74      11      3050     2.077

The table consists of 1 column of text and 8 columns of floating point numbers. I'd like to capture each column via regex.

I'm pretty new to regular expressions. Here's the faulty regex pattern I came up with:

(\S+)\s+(\s+[\d\.\-]+){8}

But the pattern captures only the first and the last columns. RegexBuddy also emits the following warning:

You repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations.

I've consulted their help file, but I don't have a clue as to how to solve this.

How can I capture each column separately?

Thanks in advance.

Recommended answer

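In C#, every iteration of a repeated group is kept in that group's Captures collection (adapted from this example):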

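using System;
using System.Text.RegularExpressions;

string input = "QSMDRYCELL   11.00   11.10   11.00   11.00    -.90      11     11000     1.212";
// Group 1 is the name column; group 2 is repeated once per numeric column.
string pattern = @"^(\S+)\s+(\s+[\d.-]+){8}$";
// Note the casing: the enum member is RegexOptions.Multiline.
Match match = Regex.Match(input, pattern, RegexOptions.Multiline);
if (match.Success) {
   Console.WriteLine("Matched text: {0}", match.Value);
   for (int ctr = 1; ctr < match.Groups.Count; ctr++) {
      // Group.Value holds only the last iteration of a repeated group.
      Console.WriteLine("   Group {0}:  {1}", ctr, match.Groups[ctr].Value);
      int captureCtr = 0;
      // Group.Captures holds every iteration, in match order.
      foreach (Capture capture in match.Groups[ctr].Captures) {
         Console.WriteLine("      Capture {0}: {1}",
                           captureCtr, capture.Value);
         captureCtr++;
      }
   }
}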



Output:

Matched text: QSMDRYCELL   11.00   11.10   11.00   11.00    -.90      11     11000     1.212
...
    Group 2:      1.212
         Capture 0:  11.00
         Capture 1:    11.10
         Capture 2:    11.00
...etc.
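
The captures above still carry their leading whitespace, because the \s+ sits inside the capturing group. As a rough sketch of one way around that, the variant below moves the repetition onto a non-capturing group and keeps only the number inside the capturing group. ParseRow is a hypothetical helper name, not part of the original answer, and the trailing \s* is an assumption that rows may end with stray spaces.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the original answer): split one table row
// into its name column followed by the eight numeric columns, as strings.
static string[] ParseRow(string line)
{
    // Group 1 is the name. Group 2 sits inside a repeated non-capturing
    // group, so its Captures collection gets one entry per iteration and
    // the separating whitespace stays outside the captured text.
    Match m = Regex.Match(line, @"^(\S+)(?:\s+([\d.-]+)){8}\s*$");
    if (!m.Success) return Array.Empty<string>();

    return new[] { m.Groups[1].Value }
        .Concat(m.Groups[2].Captures.Cast<Capture>().Select(c => c.Value))
        .ToArray();
}

string[] cols = ParseRow("RUPALIINS   150.00  159.00  150.00  156.25    6.29       4        80      .125");
Console.WriteLine(string.Join("|", cols));
// prints: RUPALIINS|150.00|159.00|150.00|156.25|6.29|4|80|.125
```

This relies on the .NET Captures collection. In regex engines that keep only the last iteration of a repeated group, the usual fallback is to write out eight separate capturing groups, or to split each line on whitespace instead.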
