Difference between \w and \b regular expression metacharacters
Question
Can anyone explain the difference between the \w and \b regular expression metacharacters?
It is my understanding that both of these metacharacters are used for word boundaries. Apart from this, which metacharacter is efficient for multilingual content?
Recommended answer
The metacharacter \b is an anchor, like the caret and the dollar sign. It matches at a position that is called a "word boundary". This match is zero-length.
There are three different positions that qualify as word boundaries:
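- Before the first character in the string, if the first character is a word character.
- After the last character in the string, if the last character is a word character.
- Between two characters in the string, where one is a word character and the other is not a word character.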
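As a rough illustration of these three positions, the sketch below uses Python's re module (an assumption; the answer itself is flavor-neutral) to list the offsets at which \b matches. The helper name boundary_positions is hypothetical.

```python
import re

def boundary_positions(text):
    """Return the offsets at which the zero-length anchor \\b matches."""
    # Each match of \b is empty, so its start offset is the boundary position.
    return [m.start() for m in re.finditer(r"\b", text)]

# "cat, dog": boundaries before 'c', after 't', before 'd', and after 'g'.
print(boundary_positions("cat, dog"))  # [0, 3, 5, 8]

# "-a-": the string starts and ends with a non-word character,
# so neither end of the string counts as a boundary.
print(boundary_positions("-a-"))  # [1, 2]
```

Because every match is empty, \b consumes no characters; it only constrains where the surrounding pattern may match.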
Simply put: \b allows you to perform a "whole words only" search using a regular expression of the form \bword\b. A "word character" is a character that can be used to form words. All characters that are not "word characters" are "non-word characters".
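A "whole words only" search of the form \bword\b might look like the following in Python. This is a minimal sketch; the function name find_whole_word and the sample text are made up for illustration.

```python
import re

def find_whole_word(word, text):
    """Return the start offsets where `word` occurs as a whole word."""
    # re.escape guards against regex metacharacters in the search term.
    pattern = r"\b" + re.escape(word) + r"\b"
    return [m.start() for m in re.finditer(pattern, text)]

text = "cat concatenate cat's bobcat"
# Matches the standalone "cat" and the "cat" in "cat's" (the apostrophe is
# a non-word character), but not the ones inside "concatenate" or "bobcat".
print(find_whole_word("cat", text))  # [0, 16]
```

Note that this only behaves as a whole-word search when the search term itself starts and ends with word characters.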
In all flavors, the characters [a-zA-Z0-9_] are word characters. These are also matched by the shorthand character class \w. Flavors showing "ascii" for word boundaries in the flavor comparison recognize only these as word characters.
\w stands for "word character", usually [A-Za-z0-9_]. Notice the inclusion of the underscore and digits.
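For multilingual content, what \w covers depends on the flavor. As one example (Python 3 is an assumption here, other engines differ), \w is Unicode-aware for str patterns by default, and the re.ASCII flag restricts it to [a-zA-Z0-9_]. The sample text is made up.

```python
import re

text = "word слово_1 語"

# Default for str patterns in Python 3: \w matches Unicode letters and digits
# plus the underscore.
unicode_words = re.findall(r"\w+", text)
print(unicode_words)  # ['word', 'слово_1', '語']

# With re.ASCII, \w only matches [a-zA-Z0-9_], so non-Latin letters
# are treated as non-word characters.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)
print(ascii_words)  # ['word', '_1']
```

Since \b is defined in terms of word characters, the same flag also changes where word boundaries fall.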
\B is the negated version of \b. \B matches at every position where \b does not. Effectively, \B matches at any position between two word characters as well as at any position between two non-word characters.
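The complementary behavior of \b and \B can be checked with a short Python sketch. The helper name positions and the sample strings are hypothetical.

```python
import re

def positions(pattern, text):
    """Return the start offsets of all matches of `pattern` in `text`."""
    return [m.start() for m in re.finditer(pattern, text)]

text = "ab  c"  # two word characters, two spaces, one word character
print(positions(r"\b", text))  # [0, 2, 4, 5]
print(positions(r"\B", text))  # [1, 3]: between "a" and "b", between the spaces

# \Bcat\B only matches "cat" when it is embedded inside a longer word.
print(positions(r"\Bcat\B", "cat concatenate"))  # [7]
```

Taken together, the two lists cover every offset from 0 to len(text) exactly once.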
\W is short for [^\w], the negated version of \w.
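A quick way to see this equivalence, again sketched in Python with a made-up sample string:

```python
import re

text = "snake_case-name v2!"

# \W matches every character that \w does not; the underscore is a word
# character, so it is not included.
non_word = re.findall(r"\W", text)
print(non_word)  # ['-', ' ', '!']

# The explicit negated class gives the same result.
negated_class = re.findall(r"[^\w]", text)
print(non_word == negated_class)  # True
```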