这个正则表达式是如何工作的? [英] How does this regular expression work?
问题描述
来自 这篇文章,
/^1?$|^(11+?)1+$/
检查一个数(它的一元值)是否是质数.
/^1?$|^(11+?)1+$/
checks whether a number(its value in unary) is prime or not.
使用这个,perl -l -e '(1 x $_) !~/^1?$|^(11+?)1+$/&&print while ++$_;'
返回素数列表.
Using this, perl -l -e '(1 x $_) !~ /^1?$|^(11+?)1+$/ && print while ++$_;'
returns a list of prime numbers.
我对 Perl 没有足够的经验,但我明白正则表达式对于非质数的数字将 true.所以,如果我们用这个表达式打印所有不产生 true 的数字,我们就会得到一个质数列表.这就是 perl 查询试图做的事情.
I do not have enough experience with Perl, but what I understand is that the regular expression will be true for a number that is not prime. So, if we print all numbers that do not produce a true with this expression, we have a list of prime numbers. Thats what the perl query is trying to do.
关于正则表达式部分,
^1?$
部分用于将 1 计数为 非质数
^1?$
part is for counting 1 as not prime
^(11+?)1+$
用于匹配从 4 开始的非素数.
^(11+?)1+$
is for matching not prime numbers starting from 4.
我不明白为什么需要正则表达式中的 ?
.据我说 /^1$|^(11+)1+$/
应该没问题,实际上
What I do not understand is why is the ?
in the regex needed at all.
According to me /^1$|^(11+)1+$/
should be just fine and actually
perl -l -e '(1 x $_) !~/^1$|^(11+)1+$/&&打印 while ++$_;'
给我一组相同的素数.
我对正则表达式的理解有什么缺陷吗?为什么需要 ?
?
?
不是应该匹配它前面的表达式的零次或一次吗?
Isn't ?
supposed to match zero or one occurrence of the expression preceding it?
推荐答案
第一个 ?
用于将空字符串(即 0)匹配为非素数.如果你不在乎正则表达式是否匹配0,那就没有必要.
The first ?
is for matching the empty string (i.e. 0) as non-prime. If you don't care whether the regexp matches 0, then it's not necessary.
第二个 ?
只是为了效率.+
通常是贪婪的",这意味着它匹配尽可能多的可用字符,然后如果正则表达式的其余部分未能匹配则回溯.+?
使它成为非贪婪的,所以它只匹配 1 个字符,然后如果正则表达式的其余部分未能匹配,则尝试匹配更多.(有关贪婪与非贪婪匹配的更多信息,请参阅 perlre 的量词部分.)
The second ?
is only for efficiency. +
is normally "greedy", which means it matches as many characters as are available, and then backtracks if the rest of the regexp fails to match. The +?
makes it non-greedy, so it matches only 1 character, and then tries matching more if the rest of the regexp fails to match. (See the Quantifiers section of perlre for more about greedy vs. non-greedy matching.)
在这个特定的正则表达式中,(11+?)
表示它测试被 2 ('11'
) 和 3 ('111'
),然后是 4,等等.如果你使用 (11+)
,它会测试被 N(数字本身),然后是 N-1,然后是 N-2 等的整除性.因为除数必须不大于 N/2,如果没有 ?
它将浪费时间测试许多不可能工作的潜在"除数.它仍然会匹配非质数,只是速度更慢.(此外,$1
将是最大的除数而不是最小的除数.)
In this particular regexp, the (11+?)
means it tests divisibility by 2 ('11'
), then 3 ('111'
), then 4, etc. If you used (11+)
, it would test divisibility by N (the number itself), then N-1, then N-2, etc. Since a divisor must be no greater than N/2, without the ?
it would waste time testing a lot of "potential" divisors that can't possibly work. It would still match non-prime numbers, just more slowly. (Also, $1
would be the largest divisor instead of the smallest one.)
这篇关于这个正则表达式是如何工作的?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!