匹配有限自然数级数 [英] matching finite natural number series

查看:37
本文介绍了匹配有限自然数级数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如何将有限自然数系列与正则表达式匹配?

How can I match finite natural number series with regex?

所以,要求是:

  • 字符串包含数字和空格(作为分隔符)
  • 第一个数字是 1
  • 每个数字(第一个除外)都等于前一个数字 + 1

应该匹配:

  • 1
  • 1 2
  • 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
  • 从 1 到 10^1000 的长系列结果

不应该匹配:

  • ``
  • 1 3 4
  • 1 2 3 4 5 6 6

除此之外,对正则表达式还有一些要求:

Beside that there are some requirements to regex:

  • 它应该是单一的一次性表达式,而不是指令的循环条件算法包
  • 它可以使用 perl 正则表达式的所有功能
  • it should be single one-shot expression, and not a cycle-condition-algorithm bale of instructions
  • it could use all power of perl regular expressions

我不确定正则表达式实际上是懒惰的,所以如果它们是,那就太好了.因为自然数列在数论中的本义是非有限的.

I'm not sure that regex are actually lazy, so it would be great if they are. Because the natural number series is non-finite in it's original meaning from number theory.

还有最后一个.请注意,我没有在该工作中使用了错误的工具.这根本不是现实世界的编程任务.

And the last one. Please notice, that I'm not using the wrong tool for that job. It's not a real world programming task at all.

推荐答案

给你.在 Perl v5.10 到 v5.14 上测试.关键是递归模式,我们在(?&Sequence)规则上递归.这是一种归纳证明.

Here you go. Tested on Perl v5.10 through v5.14. The key is the recursive pattern, where we recurse on the (?&Sequence) rule. It’s something of a proof by induction.

bigint 是为了防止你真的想从 1 .. 10**10_000 生成一个序列.如果您可以根据平台限制自己使用机器本机整数、32 位或 64 位,它的运行速度会快得多.

The bigint is there just in case you really want to generate a sequence from 1 .. 10**10_000. It will run considerably faster if you can limit yourself to machine native ints, 32-bit or 64-bit depending on your platform.

#!/usr/bin/env perl
use v5.10;
use bigint;  # only if you need stuff over maxint

my $pat = qr{
    ^
    (?= 1 \b )
    (?<Sequence>
        (?<Number> \d+ )
        (?:
            \s+
            (??{  "(?=" . (1 + $+{Number}) . ")" })
            (?&Sequence)
        )?
    )
    $
}x;

# first test embedded data
while (<DATA>) {
    if ( /$pat/ ) {
        print "PASS: ", $_;

    } else {
        print "FAIL: ", $_;
    }
}

# now generate long sequences
for my $big ( 2, 10, 25, 100, 1000, 10_000, 100_000 ) {
    my $str = q();
    for (my $i = 1; $i <= $big; $i++) {
        $str .= "$i ";
    }
    chop $str;
    if ($str =~ $pat) {
        print "PASS: ";
    } else {
        print "FAIL: ";
    }
    if (length($str) > 60) {
        my $len = length($str);
        my $first = substr($str,   0, 10);
        my $last  = substr($str, -10);
        $str = $first . "[$len chars]" . $last;
    }
    say $str;

}


__END__
5
fred
1
1 2 3
1 3 2
1 2 3 4 5
1 2 3 4 6
2 3 4 6
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
1 2 3 4 5 6 6

哪个运行产生:

FAIL: 5
FAIL: fred
PASS: 1
PASS: 1 2 3
FAIL: 1 3 2
PASS: 1 2 3 4 5
FAIL: 1 2 3 4 6
FAIL: 2 3 4 6
PASS: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
FAIL: 1 2 3 4 5 6 6
PASS: 1 2
PASS: 1 2 3 4 5 6 7 8 9 10
PASS: 1 2 3 4 5 [65 chars]2 23 24 25
PASS: 1 2 3 4 5 [291 chars] 98 99 100
PASS: 1 2 3 4 5 [3892 chars]8 999 1000
PASS: 1 2 3 4 5 [588894 chars]999 100000

冒着看似自私的风险,有一本书涵盖了这一点那类的东西.请参阅 Programming Perl 4ᵗʰ 版第 5 章中的花式模式"部分.您需要查看有关命名组"、递归模式"和语法模式"的新部分.这本书在打印机上,应该在一两天内以电子方式提供.

At the risk of seeming self-serving, there is a book that covers this sort of thing. See the section on "Fancy Patterns" in Chapter 5 of Programming Perl, 4ᵗʰ edition. You’ll want to check out the new sections on "Named Groups", on "Recursive Patterns", and on "Grammatical Patterns". The book is at the printers, and should be available electronically in a day or two.

这篇关于匹配有限自然数级数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆