Filtering with regex and .contains in Perl 6

Question

I often have to filter elements of an array of strings, containing some substring (e.g. one character). Since it can be done either by matching a regex or with .contains method, I've decided to make a small test to be sure that .contains is faster (and therefore more appropriate).

my @array = "aa" .. "cc";
my constant $substr = 'a';

my $time1 = now;
my @a_array = @array.grep: *.contains($substr);
my $time2 = now;
@a_array = @array.grep: * ~~ /$substr/;
my $time3 = now;

my $time_contains = $time2 - $time1;
my $time_regex    = $time3 - $time2;
say "contains: $time_contains sec";
say "regex:    $time_regex sec";

Then I change the size of @array and the length of $substr and compare the times which each method took to filter the @array. In most cases (as expected), .contains is much faster than regex, especially if @array is large. But in case of a small @array (as in the code above) regex is slightly faster.

contains: 0.0015010 sec
regex:    0.0008708 sec

Why does this happen?

Recommended answer

In an entirely unscientific experiment I just switched the regex version and the contains version around and found that the difference in performance you're measuring is not "regex vs contains" but in fact "first thing versus second thing":

When contains comes first:

contains: 0.001555  sec
regex:    0.0009051 sec

When the regex comes first:

regex:    0.002055 sec
contains: 0.000326 sec
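
One way to check the "first thing versus second thing" effect is to time the identical filter twice in a row. The following is a minimal sketch reusing the question's @array and $substr; the loop and the @hits variable are illustrative additions, not part of the original test. If the first run is still noticeably slower than the second, the gap cannot be due to regex versus contains.

```raku
my @array = "aa" .. "cc";
my constant $substr = 'a';

# Time the *same* filter twice; both runs use .contains,
# so any difference between them comes from ordering alone.
for 1..2 -> $run {
    my $start = now;
    my @hits = @array.grep: *.contains($substr);   # assignment forces evaluation
    say "contains, run $run: { now - $start } sec";
}
```

The exact numbers will vary between machines and Rakudo versions, so treat them only as a relative comparison.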

Benchmarking properly is a difficult task. It's really easy to accidentally measure something different from what you wanted to figure out.

When I want to compare the performance of multiple things I will usually run each thing in a separate script, or maybe have a shared script but only run one of the tasks at once (for example using a multi sub MAIN("task1") approach). That way any startup work gets shared.
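
The shared-script approach might look roughly like the sketch below. The task names 'contains' and 'regex', the time-it helper and the $rounds repeat count are assumptions made for illustration; only the multi sub MAIN idea comes from the answer. Each invocation runs one task, so neither method pays for being measured first.

```raku
my @array = "aa" .. "cc";
my constant $substr = 'a';
my constant $rounds = 1000;   # assumed repeat count; adjust as needed

# Hypothetical helper: run the block $rounds times, return the mean time per run.
sub time-it(&code) {
    my $start = now;
    code() for ^$rounds;
    return (now - $start) / $rounds;
}

# Selected when the script is called with the argument "contains".
multi sub MAIN('contains') {
    my $t = time-it { my @hits = @array.grep: *.contains($substr) };
    say "contains: $t sec per run";
}

# Selected when the script is called with the argument "regex".
multi sub MAIN('regex') {
    my $t = time-it { my @hits = @array.grep: * ~~ /$substr/ };
    say "regex:    $t sec per run";
}
```

Saved under a hypothetical name such as bench.p6, it would be run once as `perl6 bench.p6 contains` and once as `perl6 bench.p6 regex`, and the two reported averages compared.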

In the #perl6 IRC channel on freenode we have a bot called benchable6 which can do benchmarks for you. Read the section "Comparing Code" on its wiki page to find out how it can compare two pieces of code for you.
