Case-insensitive ordered word search via regular expression
Question
I just started with regular expressions in Perl. After working through various online tutorials, I want to write a regular expression that performs a case-insensitive match on words in a specified order.
I'm trying to determine whether string "A" consists of a word or a sequence of words from string "B", and I want to do this case-insensitively.
For example, if string "B" is "John Von Neumann", then "JOhn", "Von NeumaNn", "VoN", and "john neuMann" would match, but strings like "Joh", "NeumaNn VoN", and "Vonn" would not.
I am not sure how to do this with regular expressions. Any ideas?
Recommended answer
Let's ignore case for a second.
    John Von Neumann
can be matched by any of the following (1 marks a word that is kept, 0 a word that is dropped):
    John Von Neumann    1 1 1
    John Von            1 1 0
    John     Neumann    1 0 1
    John                1 0 0
         Von Neumann    0 1 1
         Von            0 1 0
             Neumann    0 0 1
So the regex pattern you are looking for is
    /^(?:John Von Neumann|John Von|John Neumann|John|...)\z/i
Here's how you can build the list:
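    # Return the positions of the set bits in $n (bit 0 is the first word).
    sub true_indexes {
        my ($n) = @_;
        my $i = 0;
        my @indexes;
        while ($n) {
            push @indexes, $i if $n & 1;
            ++$i;
            $n >>= 1;
        }
        return @indexes;
    }

    my @words = split(' ', 'John Von Neumann');
    my @patterns;
    # Each bitmask from 1 to 2**N - 1 selects one non-empty, in-order subset of the words.
    unshift @patterns, join ' ', @words[ true_indexes($_) ]
        for 1 .. (2**@words)-1;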
And finally, we can generate the pattern:
my $pat = join '|', map quotemeta, @patterns;
my $re = qr/$pat/i;
You would use it like this:
    if ($input =~ /^$re\z/) {
        print "match\n";
    } else {
        print "no match\n";
    }
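The pieces above can be combined into a single self-contained script and run against the examples from the question. This is only a sketch: the helper name build_ordered_subset_re is hypothetical (it does not appear in the answer), and it selects words with grep over the bit positions instead of the true_indexes helper. Note that the number of alternatives grows as 2**N - 1, so this approach is only practical for short phrases.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original answer): builds an anchored,
# case-insensitive regex that matches any non-empty, in-order selection of
# the words in $phrase.
sub build_ordered_subset_re {
    my ($phrase) = @_;
    my @words = split ' ', $phrase;
    my @patterns;
    for my $mask (1 .. 2**@words - 1) {
        # Bit $i of $mask decides whether word $i is kept.
        my @kept = @words[ grep { $mask & (1 << $_) } 0 .. $#words ];
        push @patterns, quotemeta(join ' ', @kept);
    }
    my $pat = join '|', @patterns;
    return qr/^(?:$pat)\z/i;
}

my $re = build_ordered_subset_re('John Von Neumann');

# Inputs taken from the question: the first four should match, the last three should not.
for my $input ('JOhn', 'Von NeumaNn', 'VoN', 'john neuMann',
               'Joh', 'NeumaNn VoN', 'Vonn') {
    printf "%-12s %s\n", $input, $input =~ $re ? 'match' : 'no match';
}
```

Here the anchors are baked into the compiled regex, so callers can write $input =~ $re directly instead of wrapping it in /^$re\z/.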