How can I extract substrings from a string in Perl?


Question

Consider the following strings:

1) Scheme ID: abc-456-hu5t10 (High priority) *

2) Scheme ID: frt-78f-hj542w (Balanced)

3) Scheme ID: 23f-f974-nm54w (super formula run) *

and so on in the above format. The parts that change across the strings are the ID, the text in parentheses, and the optional trailing *.

Imagine I have many strings in the format shown above. I want to pick 3 substrings from each of them:

  • The 1st substring, containing the alphanumeric value (in the example above, "abc-456-hu5t10")
  • The 2nd substring, containing the words in parentheses (in the example above, "High priority")
  • The 3rd substring, containing the * (if a * is present at the end of the string; otherwise leave it out)

How do I pick these 3 substrings from each string shown above? I know it can be done using regular expressions in Perl. Can you help with this?

Recommended answer

You could do something like this:

use strict;
use warnings;

my $data = <<END;
1) Scheme ID: abc-456-hu5t10 (High priority) *
2) Scheme ID: frt-78f-hj542w (Balanced)
3) Scheme ID: 23f-f974-nm54w (super formula run) *
END

foreach (split(/\n/, $data)) {
  # Skip any line that does not match the pattern.
  $_ =~ /Scheme ID: ([a-z0-9-]+)\s+\(([^)]+)\)\s*(\*)?/ || next;
  my ($id, $word, $star) = ($1, $2, $3);
  $star = '' unless defined $star;  # $3 is undef when there is no trailing *
  print "$id $word $star\n";
}
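As an aside, the same match can be wrapped in a small reusable sub. This is only a sketch: `parse_scheme` is a hypothetical name that is not part of the original answer, and it assumes Perl 5.10 or later for the `//` operator and that a missing star should come back as an empty string.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (id, words, star), or an empty list
# when the line does not match.
sub parse_scheme {
    my ($line) = @_;
    # In list context a successful match returns the captured groups.
    my ($id, $word, $star) =
        $line =~ /Scheme ID: ([a-z0-9-]+)\s+\(([^)]+)\)\s*(\*)?/
        or return;
    # The third group is undef when there is no trailing star.
    return ($id, $word, $star // '');
}

for my $line ('1) Scheme ID: abc-456-hu5t10 (High priority) *',
              '2) Scheme ID: frt-78f-hj542w (Balanced)') {
    my ($id, $word, $star) = parse_scheme($line) or next;
    print "id=$id words=$word star=[$star]\n";
}
# prints:
# id=abc-456-hu5t10 words=High priority star=[*]
# id=frt-78f-hj542w words=Balanced star=[]
```

Returning an empty list on failure lets the caller write `parse_scheme($line) or next`, which mirrors the `|| next` in the loop above.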

The key thing is the regular expression:

Scheme ID: ([a-z0-9-]+)\s+\(([^)]+)\)\s*(\*)?

It breaks down as follows.

The fixed string "Scheme ID: ":

Scheme ID: 

Followed by one or more of the characters a-z, 0-9 or -. We use parentheses to capture it as $1:

([a-z0-9-]+)

Followed by one or more whitespace characters:

\s+

Followed by an opening parenthesis (which we escape), then one or more characters that aren't a closing parenthesis, and then a closing parenthesis (escaped). We use unescaped parentheses to capture the words as $2:

\(([^)]+)\)

Followed by optional whitespace and maybe a *, captured as $3 (which is undefined when there is no *):

\s*(\*)?
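The breakdown above can also be written into the pattern itself. The sketch below uses the /x modifier so each piece carries its own comment, and named captures instead of $1, $2 and $3. It assumes Perl 5.10 or later for named captures; `$scheme_re` and `describe_scheme` are illustrative names, not part of the original answer.

```perl
use strict;
use warnings;

# The same pattern, spread out with the x modifier. Under x, literal
# spaces in the pattern are ignored, so the spaces in "Scheme ID: "
# are written as "\ ".
my $scheme_re = qr/
    Scheme\ ID:\             # the fixed string "Scheme ID: "
    (?<id>[a-z0-9-]+)        # one or more of a-z, 0-9 or -
    \s+                      # one or more whitespace characters
    \( (?<words>[^)]+) \)    # text inside literal parentheses
    \s*                      # optional whitespace
    (?<star>\*)?             # an optional trailing star
/x;

# Hypothetical helper: returns a one-line summary, or nothing on no match.
sub describe_scheme {
    my ($line) = @_;
    return unless $line =~ $scheme_re;
    # Named captures are read from the %+ hash.
    my ($id, $words) = ($+{id}, $+{words});
    my $star = defined $+{star} ? 'yes' : 'no';
    return "$id | $words | star: $star";
}

print describe_scheme('3) Scheme ID: 23f-f974-nm54w (super formula run) *'), "\n";
# prints: 23f-f974-nm54w | super formula run | star: yes
```

Named captures keep the code readable if the pattern later gains or loses a group, since nothing depends on the group numbers.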
