Regex Group in Perl: how to capture elements into array from regex group that matches unknown number of/multiple/variable occurrences from a string?


Question

In Perl, how can I use one regex grouping to capture more than one occurrence that matches it, into several array elements?

For example, for a string:

var1=100 var2=90 var5=hello var3="a, b, c" var7=test var3=hello

and this code to process it:

$string = "var1=100 var2=90 var5=hello var3=\"a, b, c\" var7=test var3=hello";

my @array = $string =~ <regular expression here>

for ( my $i = 0; $i < scalar( @array ); $i++ )
{
  print $i.": ".$array[$i]."\n";
}

I would like to see this as output:

0: var1=100
1: var2=90
2: var5=hello
3: var3="a, b, c"
4: var7=test
5: var3=hello

What would I use as a regex?

What the things I want to match have in common is an assignment pattern, so something like:

my @array = $string =~ m/(\w+=[\w\"\,\s]+)*/;

Where the * is meant to indicate repeated occurrences matching the group (strictly speaking, * means zero or more).

(I ruled out split() because some matches contain spaces within themselves (e.g. var3...), so it would not give the desired results.)

With the above regex, I only get:

0: var1=100 var2

Is this possible with a regex alone, or is additional code required?

I have already looked at existing answers found by searching for "perl regex multiple group", but they did not give me enough clues:

  • Dealing with multiple capture groups in multiple records
  • Multiple matches within a regex group?
  • Regex: Repeated capturing groups
  • Regex match and grouping
  • How do I regex match with grouping with unknown number of groups
  • awk extract multiple groups from each line
  • Matching multiple regex groups and removing them
  • Perl: Deleting multiple reccuring lines where a certain criterion is met
  • Regex matching into multiple groups per line?
  • PHP RegEx Grouping Multiple Matches
  • How to find multiple occurrences with regex groups?

Recommended answer

my $string = "var1=100 var2=90 var5=hello var3=\"a, b, c\" var7=test var3=hello";

while($string =~ /(?:^|\s+)(\S+)\s*=\s*("[^"]*"|\S*)/g) {
        print "<$1> => <$2>\n";
}

Prints:

<var1> => <100>
<var2> => <90>
<var5> => <hello>
<var3> => <"a, b, c">
<var7> => <test>
<var3> => <hello>

Explanation:

Last piece first: the g flag at the end means that the regex can be applied to the string multiple times. In scalar context (as in the while condition), each subsequent match continues from where the previous match ended in the string.
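To make the behaviour of the g flag concrete, here is a minimal sketch. The sample string and the @positions array are assumptions made for illustration; pos() is Perl's built-in that reports where the next /g match on a string will start.

```perl
use strict;
use warnings;

# Hypothetical sample input, simpler than the string in the question.
my $string = 'a=1 b=2 c=3';
my @positions;

# In scalar context, each successful match with the g flag resumes
# where the previous one ended; pos() reports that offset.
while ($string =~ /(\w+)=(\w+)/g) {
    push @positions, pos($string);
    print "$1 => $2 (next search starts at offset ", pos($string), ")\n";
}
```

When the match finally fails, the loop ends and the search position of the string is reset, so a later match starts again from the beginning.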

Now for the regex: (?:^|\s+) matches either the beginning of the string or a run of one or more whitespace characters. This is needed so that, the next time the regex is applied, it skips the whitespace between the key/value pairs. The ?: means that the contents of the parentheses are not captured as a group (we don't need the whitespace, only the key and the value). \S+ matches the variable name; because an equals sign must follow, \S+ backtracks until it stops just before the =. Then we skip any amount of whitespace around the equals sign. Finally, ("[^"]*"|\S*) matches either a pair of double quotes with any number of non-quote characters in between, or any number of non-whitespace characters for the value. Note that the quote matching is fairly fragile and won't handle escaped quotes properly, e.g. "\"quoted\"" would result in "\".
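If escaped quotes matter, one possible variation (not part of the answer above, and only a sketch) replaces "[^"]*" with "(?:[^"\\]|\\.)*", which treats a backslash followed by any character as part of the quoted value. The sample string and the @pairs array are assumptions for illustration; the x flag is used only so the pattern can carry comments.

```perl
use strict;
use warnings;

# Hypothetical input containing backslash-escaped quotes inside a value.
my $string = 'var1=100 var3="a, \"b\", c" var7=test';
my @pairs;

while ($string =~ m{
        (?:^|\s+)               # start of string, or whitespace between pairs
        (\S+)                   # key (backtracks to stop before the equals sign)
        \s*=\s*                 # equals sign with optional whitespace
        (
            "(?:[^"\\]|\\.)*"   # quoted value that may contain escaped characters
          |
            \S*                 # or an unquoted run of non-whitespace
        )
    }gx) {
    push @pairs, [$1, $2];
}

printf "<%s> => <%s>\n", @$_ for @pairs;
```

This still assumes that a value never spans lines and that a backslash always escapes the character after it; a real configuration format may need a proper parser.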

Since you really want the whole assignment rather than the separate keys and values, here's a one-liner that extracts those:

my @list = $string =~ /(?:^|\s+)((?:\S+)\s*=\s*(?:"[^"]*"|\S*))/g;
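Put together with the printing loop from the question, the one-liner can be run as a complete script. This is a sketch that reuses the string and regex from above; only the loop style (iterating over 0 .. $#list) is changed from the question's code.

```perl
use strict;
use warnings;

my $string = 'var1=100 var2=90 var5=hello var3="a, b, c" var7=test var3=hello';

# In list context, a match with the g flag returns the captures of every match.
my @list = $string =~ /(?:^|\s+)((?:\S+)\s*=\s*(?:"[^"]*"|\S*))/g;

for my $i (0 .. $#list) {
    print "$i: $list[$i]\n";
}
```

Because the pattern has a single capturing group, @list receives one element per assignment, which is the shape the question asks for.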
