这个正则表达式如何工作? [英] How does this regular expression work?
问题描述
来自这篇文章,
/^1?$|^(11+?)\1+$/
检查一个数字(其一元值)是否为质数.
/^1?$|^(11+?)\1+$/
checks whether a number(its value in unary) is prime or not.
使用perl -l -e '(1 x $_) !~ /^1?$|^(11+?)\1+$/ && print while ++$_;'
返回素数列表.
我没有足够的Perl经验,但是我了解的是,对于不是质数的数字,正则表达式将为 true .因此,如果我们用该表达式打印所有不会产生 true 的数字,则将有一个质数列表.那就是perl查询想要做的事情.
I do not have enough experience with Perl, but what I understand is that the regular expression will be true for a number that is not prime. So, if we print all numbers that do not produce a true with this expression, we have a list of prime numbers. Thats what the perl query is trying to do.
关于正则表达式部分,
^1?$
部分用于将1算作非素数
^(11+?)\1+$
用于匹配从4开始的非素数.
^(11+?)\1+$
is for matching not prime numbers starting from 4.
我不明白的是,为什么根本需要正则表达式中的?
.
根据我的说法,/^1$|^(11+)\1+$/
应该很好,实际上
What I do not understand is why is the ?
in the regex needed at all.
According to me /^1$|^(11+)\1+$/
should be just fine and actually
perl -l -e '(1 x $_) !~ /^1$|^(11+)\1+$/ && print while ++$_;'
给我相同的素数集.
perl -l -e '(1 x $_) !~ /^1$|^(11+)\1+$/ && print while ++$_;'
gives me the same set of prime numbers.
我对正则表达式的理解是否存在缺陷?为什么需要?
?
Is there any flaw in my understanding of the regular expression? Why are the ?
s needed?
?
不应该与它前面的表达式的零个或一次匹配吗?
Isn't ?
supposed to match zero or one occurrence of the expression preceding it?
推荐答案
第一个?
用于将空字符串(即0)匹配为非素数.如果您不关心正则表达式是否与0匹配,则没有必要.
The first ?
is for matching the empty string (i.e. 0) as non-prime. If you don't care whether the regexp matches 0, then it's not necessary.
第二个?
仅用于提高效率. +
通常是贪婪的",这意味着它会匹配可用的尽可能多的字符,如果其余的regexp不能匹配,则返回. +?
使它变得非贪婪,因此它仅匹配1个字符,如果其余regexp未能匹配,则尝试匹配更多字符. (有关贪婪匹配和非贪婪匹配的更多信息,请参见 perlre的量词"部分.)
The second ?
is only for efficiency. +
is normally "greedy", which means it matches as many characters as are available, and then backtracks if the rest of the regexp fails to match. The +?
makes it non-greedy, so it matches only 1 character, and then tries matching more if the rest of the regexp fails to match. (See the Quantifiers section of perlre for more about greedy vs. non-greedy matching.)
在此特定的正则表达式中,(11+?)
表示它将按2('11'
),然后是3('111'
),然后是4,依此类推.如果使用(11+)
,它将按N来测试可除性. (数字本身),然后是N-1,然后是N-2,等等.由于除数必须不大于N/2,如果没有?
,则浪费大量时间来测试许多潜在"除数可以可能会工作.它仍然会匹配非素数,只是速度较慢. (此外,$1
将是最大的除数,而不是最小的除数.)
In this particular regexp, the (11+?)
means it tests divisibility by 2 ('11'
), then 3 ('111'
), then 4, etc. If you used (11+)
, it would test divisibility by N (the number itself), then N-1, then N-2, etc. Since a divisor must be no greater than N/2, without the ?
it would waste time testing a lot of "potential" divisors that can't possibly work. It would still match non-prime numbers, just more slowly. (Also, $1
would be the largest divisor instead of the smallest one.)
这篇关于这个正则表达式如何工作?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!