Regex to match word boundary beginning with special characters
Question
I have a regex that matches words fine, except when they contain a special character, such as ~Query, which is the name of a member of a C++ class.
I need to use word boundaries, as shown below, because some member names are single characters.
$key =~ /\b$match\b/
I tried numerous expressions I thought would work, such as /[~]*\b$match\b/ or /\b[~]*$match\b/.
Is it possible to put a word boundary on words that may contain a special character?
Recommended answer
\b is equivalent to
(?:(?<!\w)(?=\w)|(?<=\w)(?!\w))
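To see why the attempts in the question fail, it can help to run plain \b and its lookaround expansion side by side. The following is a minimal sketch: the sample strings and the variable name $wb are made up for illustration, and \Q...\E is added on the assumption that $match holds a literal member name rather than a pattern.

```perl
use strict;
use warnings;

# \b spelled out with lookarounds, as given above.
my $wb = qr/(?:(?<!\w)(?=\w)|(?<=\w)(?!\w))/;

my $match = '~Query';
my %result;

# Hypothetical inputs, chosen only to illustrate the behavior.
for my $key ('~Query', 'x~Query', 'Query') {
    my $plain    = $key =~ /\b\Q$match\E\b/   ? 1 : 0;
    my $expanded = $key =~ /$wb\Q$match\E$wb/ ? 1 : 0;
    $result{$key} = [$plain, $expanded];
    print "$key: plain=$plain expanded=$expanded\n";
}
```

Both forms agree on each input. Because ~ is not a \w character, a boundary in front of ~ exists only when a word character precedes it, so the bare name ~Query does not match while x~Query does, which is the opposite of what is wanted.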
If you want to treat ~ as a word character, change \w to [\w~].
(?:(?<![\w~])(?=[\w~])|(?<=[\w~])(?![\w~]))
Example usage:
my $word_char = qr/[\w~]/;
my $boundary = qr/(?<!$word_char)(?=$word_char)
|(?<=$word_char)(?!$word_char)/x;
$key =~ /$boundary$match$boundary/
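For anyone who wants to try this out, the snippet above can be wrapped in a small helper and run against a few strings. This is only a sketch: the sub name has_member and the sample inputs are hypothetical, and \Q...\E assumes $match is a literal name rather than a pattern.

```perl
use strict;
use warnings;

my $word_char = qr/[\w~]/;
my $boundary  = qr/(?<!$word_char)(?=$word_char)
                  |(?<=$word_char)(?!$word_char)/x;

# Hypothetical helper: returns 1 if $match occurs in $key as a whole
# "word", where ~ counts as a word character, and 0 otherwise.
sub has_member {
    my ($key, $match) = @_;
    return $key =~ /$boundary\Q$match\E$boundary/ ? 1 : 0;
}

print has_member('Foo::~Query()', '~Query'), "\n";  # 1: delimited by ':' and '('
print has_member('x~Query',       '~Query'), "\n";  # 0: 'x' is a word character
print has_member('~QueryPlan',    '~Query'), "\n";  # 0: 'P' continues the word
print has_member('a = b',         'a'),      "\n";  # 1: single-character name
print has_member('ab',            'a'),      "\n";  # 0: 'b' continues the word
```

Interpolating the qr// objects keeps each one in its own group, so the alternation inside $boundary does not leak into the surrounding pattern.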
If we know $match can only match something that starts and ends with a $word_char, we can simplify as follows:
my $word_char = qr/[\w~]/;
my $start_bound = qr/(?<!$word_char)/;
my $end_bound = qr/(?!$word_char)/;
$key =~ /$start_bound$match$end_bound/
This is simple enough to inline:
$key =~ /(?<![\w~])$match(?![\w~])/
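The simplification depends on the stated assumption about $match, and a quick comparison can make that concrete. The sketch below is illustrative only: the sub names full_boundary and inline_boundary and the sample inputs (including operator==) are hypothetical, and \Q...\E assumes $match is a literal name.

```perl
use strict;
use warnings;

my $boundary = qr/(?<![\w~])(?=[\w~])|(?<=[\w~])(?![\w~])/;

# Full two-sided boundary on each end of the name.
sub full_boundary {
    my ($key, $match) = @_;
    return $key =~ /$boundary\Q$match\E$boundary/ ? 1 : 0;
}

# Simplified, inlined form: only the outward-looking half of each boundary.
sub inline_boundary {
    my ($key, $match) = @_;
    return $key =~ /(?<![\w~])\Q$match\E(?![\w~])/ ? 1 : 0;
}

# Names that start and end with a word character: both forms agree.
for my $case (['obj.x = 1', 'x'], ['max = 1', 'x'], ['delete ~Query;', '~Query']) {
    my ($key, $match) = @$case;
    printf "%-16s %-8s full=%d inline=%d\n",
        $key, $match, full_boundary($key, $match), inline_boundary($key, $match);
}

# A name ending in a non-word character breaks the assumption,
# and the two forms no longer agree.
printf "%-16s %-8s full=%d inline=%d\n",
    'a.operator==(b)', 'operator==',
    full_boundary('a.operator==(b)', 'operator=='),
    inline_boundary('a.operator==(b)', 'operator==');
```

For the last case the full form does not match, because no boundary exists between = and (, while the inlined form only checks that no word character follows. Which behavior is preferable depends on the names being searched for.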