Match the nth longest possible string in Perl

Question

The pattern-matching quantifiers of a Perl regular expression are "greedy": they match the longest possible string. To make a match "ungreedy", a ? can be appended to the pattern quantifier (*, +).

Here is an example:

#!/usr/bin/perl

$string="111s11111s";

#-- greedy match
$string =~ /^(.*)s/;
print "$1\n"; # prints 111s11111

#-- ungreedy match
$string =~ /^(.*?)s/;
print "$1\n"; # prints 111

But how can one find the second, third, and so on, longest possible match in Perl? Feel free to use a simple example of your own if a better one is needed.

Recommended answer

Use a conditional expression, (?(condition)yes-pattern), a code expression, (?{ ... }), and a backtracking control verb, (*FAIL).

my $skips = 1;
$string =~ /^(.*)s(?(?{$skips-- > 0})(*FAIL))/;

The above still uses greedy matching, but it deliberately makes the longest match fail, so the engine backtracks to the next-longest one. If you wanted the 3rd longest, you would set the number of skips to 2.

As demonstrated below:

#!/usr/bin/perl
use strict;
use warnings;

my $string = "111s11111s11111s";

$string =~ /^(.*)s/;
print "Greedy match     - $1\n";

$string =~ /^(.*?)s/;
print "Ungreedy match   - $1\n";

my $skips = 1;
$string =~ /^(.*)s(?(?{$skips-- > 0})(*FAIL))/;
print "2nd Greedy match - $1\n";

Outputs:

Greedy match     - 111s11111s11111
Ungreedy match   - 111
2nd Greedy match - 111s11111
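The skip counter can be wrapped in a small subroutine so that n becomes a parameter. The following is only a sketch: the name nth_longest_prefix is hypothetical (it is not part of the original answer), and it assumes the same anchored pattern as above. It uses a localized package variable for the counter so that the code expression can see it from inside a subroutine on older Perl versions as well.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not from the original answer: return the n-th longest
# prefix of $string that is immediately followed by "s", or undef if there
# are fewer than n candidates.
sub nth_longest_prefix {
    my ($string, $n) = @_;

    # Package variable, localized per call, so the (?{ ... }) block can
    # reach it from inside a subroutine.
    local our $skips = $n - 1;

    # Force the first $n - 1 candidates to fail; keep the next one.
    return $string =~ /^(.*)s(?(?{ $skips-- > 0 })(*FAIL))/ ? $1 : undef;
}

my $string = "111s11111s11111s";
for my $n (1 .. 4) {
    my $match = nth_longest_prefix($string, $n);
    printf "%d: %s\n", $n, defined $match ? $match : "(no match)";
}
```

With this input there are only three candidates, so the fourth request should come back as undef rather than a match.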

When using such advanced features, it is important to understand regular expressions well enough to predict the results. This particular case works because the regex is fixed at one end with ^. That means each subsequent match is shorter than the previous one. However, if both ends could shift, we could not necessarily predict the order.

If that were the case, you would find all of the matches and then sort them:

use strict;
use warnings;

my $string = "111s11111s11111s";

my @seqs;
$string =~ /^(.*)s(?{push @seqs, $1})(*FAIL)/;

my @sorted = sort {length $b <=> length $a} @seqs;

use Data::Dump;
dd @sorted;

Outputs:

("111s11111s11111", "111s11111", 111)
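To illustrate why sorting matters once both ends can move, here is a minimal sketch that drops the ^ anchor. The pattern (1+)s and the variable names are assumptions chosen for illustration, and the lexical @found inside the code expression assumes Perl v5.18 or later. The engine reports candidates in the order it tries start positions, not in order of length: the first candidate found is 111, while the longest one, 11111, only turns up later.

```perl
use strict;
use warnings;

my $string = "111s11111s";

# No ^ anchor: the start of the match can move as well as the end.
# Every run of 1s that is directly followed by "s" is recorded, then the
# match is forced to fail so the engine keeps looking.
my @found;
$string =~ /(1+)s(?{ push @found, $1 })(*FAIL)/;

# Discovery order is not length order, so sort longest first.
my @sorted = sort { length $b <=> length $a } @found;

print "Found:  @found\n";
print "Sorted: @sorted\n";
print "2nd longest: $sorted[1]\n";
```

Note that the same substring can be reported more than once when it occurs at several positions, so the sorted list may need de-duplicating depending on what "nth longest" should mean.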

Note for Perl versions prior to v5.18

In Perl v5.18, /(?{})/ and /(??{})/ were heavily reworked, which made the scoping of lexical variables work properly inside code expressions such as those used above. Before that, the above code would produce the following errors, as demonstrated by a subroutine version run under v5.16.2:

Variable "$skips" will not stay shared at (re_eval 1) line 1.
Variable "@seqs" will not stay shared at (re_eval 2) line 1.

The fix for older implementations of RE code expressions is to declare the variables with our and, as good coding practice, to localize them when they are initialized. This is demonstrated by a modified subroutine version run under v5.16.2, and in the snippet below:

local our @seqs;
$string =~ /^(.*)s(?{push @seqs, $1})(*FAIL)/;
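For context, the same fix placed inside a subroutine might look like the sketch below. The subroutine name prefixes_before_s is hypothetical and only serves to show where local our sits; localizing the array also resets it on every call, so results from one call do not leak into the next.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every prefix of $string that is followed by
# "s" and return the prefixes longest first.
sub prefixes_before_s {
    my ($string) = @_;

    # Package variable, localized (and therefore emptied) for this call.
    local our @seqs;
    $string =~ /^(.*)s(?{ push @seqs, $1 })(*FAIL)/;

    return sort { length $b <=> length $a } @seqs;
}

print join(", ", prefixes_before_s("111s11111s11111s")), "\n";
# prints: 111s11111s11111, 111s11111, 111
```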
