Regex using word boundary but word ends with a . (period)


Problem description

I want to match the word i.v., case-insensitively.

I have the pattern

(?i)\bi\.v\.

but I want a word boundary at the end. The above pattern fails in that it also matches the start of

i.v.x

But if I try to add a word boundary to the end

(?i)\bi\.v\.\b

it fails in that it does not even match i.v., as I think the \b is eating the literal . because . is a word break. I need the \. to be greedy.

I want to match

sam i.v. sam

I do not want to match

sam.i.v.
i.v.sam

This gets closer

(?i)\bi\.v\.\s$

but it does not find i.v. at the end of a line.

Recommended answer

\b only matches between a word character (a letter, digit, or underscore) and a non-word character (or the start/end of the string). Therefore, it doesn't match after a ., unless a word character immediately follows that dot.
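The behaviour described above can be checked with a short sketch. It assumes Python's re module, since the question does not name a regex flavor; the variable name with_boundary is only for illustration.

```python
import re

# The question's pattern with a trailing \b added.
with_boundary = re.compile(r"(?i)\bi\.v\.\b")

# "." followed by " " is non-word/non-word, so there is no boundary: no match.
print(with_boundary.search("sam i.v. sam"))

# "." followed by end of string is also not a boundary: no match.
print(with_boundary.search("i.v."))

# "." followed by "x" is a non-word/word transition, so \b succeeds here.
print(with_boundary.search("i.v.x"))
```

So adding \b after the final dot inverts the intent: it rejects the wanted input and accepts the unwanted one.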

If your intent is to make sure that no non-whitespace character follows the dot, then you can specify that using a negative lookahead assertion:

(?i)\bi\.v\.(?!\S)

(?!\S) means "Assert that the next character is not a non-whitespace character".

This may sound a bit convoluted - why the double negative? Why not (?=\s), which means "Assert that the next character is a whitespace character"? Well, there is a subtle difference: the second version requires a whitespace character to be there, which means the regex would fail to match at the end of the string. The first regex handles that corner case as well.
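A small comparison makes the end-of-string difference concrete. This is a sketch assuming Python's re module; the names negative and positive are illustrative only.

```python
import re

negative = re.compile(r"(?i)\bi\.v\.(?!\S)")  # next char is not non-whitespace
positive = re.compile(r"(?i)\bi\.v\.(?=\s)")  # next char must be whitespace

for text in ["sam i.v. sam", "sam i.v.", "i.v.x"]:
    # Print whether each variant finds a match in the sample text.
    print(repr(text), bool(negative.search(text)), bool(positive.search(text)))
```

Only the second sample, where i.v. sits at the very end of the string, separates the two: the negative lookahead matches there and the positive one does not.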

If you generally want the concept of "word boundary" to mean "space-delimited", then you need to replace the first \b as well:

(?i)(?<!\S)i\.v\.(?!\S)

Otherwise the regex will match sam.i.v., which you don't seem to want.
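To check both variants against the inputs listed in the question, a sketch along these lines can be used. It assumes Python's re module, which supports the fixed-width lookbehind used here; the names loose and strict and the uppercase sample are illustrative additions.

```python
import re

loose = re.compile(r"(?i)\bi\.v\.(?!\S)")        # leading \b kept
strict = re.compile(r"(?i)(?<!\S)i\.v\.(?!\S)")  # whitespace-delimited on both sides

cases = ["sam i.v. sam", "sam I.V.", "sam.i.v.", "i.v.sam"]

for text in cases:
    # Show which pattern accepts each sample.
    print(repr(text), bool(loose.search(text)), bool(strict.search(text)))
```

Both patterns accept the first two samples and reject i.v.sam; only the strict one also rejects sam.i.v., because the character before the i is a non-whitespace dot.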
