Regular Expressions and negating a whole character group


Question

I'm attempting something that I feel should be fairly obvious, but it isn't. I'm trying to match a string which does NOT contain a specific sequence of characters. I've tried using [^ab], [^(ab)], etc. to match strings containing no 'a's or 'b's, or only 'a's, or only 'b's, or 'ba', but not to match 'ab'. The examples I gave won't match 'ab', it's true, but they also won't match 'a' alone, and I need them to. Is there some simple way to do this?

Recommended answer

Use negative lookahead:

^(?!.*ab).*$
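To see how the lookahead behaves on the strings from the question, here is a minimal Python sketch. The helper name `lacks_ab` and the sample strings are illustrative assumptions, not part of the original answer, and the sketch assumes single-line input (by default `.` does not match a newline).

```python
import re

# Anchored negative lookahead: at the start of the string, assert that
# "ab" does not occur anywhere ahead, then consume the whole string.
NO_AB = re.compile(r"^(?!.*ab).*$")


def lacks_ab(text: str) -> bool:
    """Return True when `text` does not contain the sequence 'ab'.

    Hypothetical helper for illustration; assumes `text` is a single line.
    """
    return NO_AB.match(text) is not None


# Strings with only 'a's, only 'b's, or 'ba' pass; anything containing 'ab' fails.
for sample in ["", "a", "b", "ba", "aaa", "ab", "cab", "abc"]:
    print(repr(sample), lacks_ab(sample))
```

By contrast, a character class such as [^ab] excludes single characters, not a sequence, which is why the attempts in the question also rejected 'a' on its own.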

UPDATE: In the comments below, I stated that this approach is slower than the one given in Peter's answer. I've run some tests since then and found that it's actually slightly faster. However, the reason to prefer this technique over the other is not speed but simplicity.

The other technique, described here as a tempered greedy token, is suitable for more complex problems, like matching delimited text where the delimiters consist of multiple characters (like HTML, as Luke commented below). For the problem described in the question, it's overkill.
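As a rough illustration of where a tempered greedy token earns its keep, the sketch below extracts text between multi-character delimiters. The `<b>`/`</b>` tags, the pattern name `BOLD`, and the sample text are assumptions chosen for the example; this is not a general HTML parser.

```python
import re

# Tempered greedy token: (?:(?!</?b>).)* consumes one character at a time,
# but only while the next characters do not start "<b>" or "</b>".
BOLD = re.compile(r"<b>((?:(?!</?b>).)*)</b>")

text = "x <b>one</b> y <b>two</b> z"

# findall returns the contents of the single capturing group for each match.
print(BOLD.findall(text))
```

Here the lookahead is re-checked at every position inside the delimiters, which is what makes it "floating" rather than anchored at the start of the line.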

For anyone who's interested, I tested with a large chunk of Lorem Ipsum text, counting the number of lines that don't contain the word "quo". These are the regexes I used:

(?m)^(?!.*\bquo\b).+$

(?m)^(?:(?!\bquo\b).)+$

Whether I search for matches in the whole text or break it up into lines and match them individually, the anchored lookahead (the first regex) consistently outperforms the floating one (the second).
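A comparison along these lines can be sketched in Python as follows. The sample text, the helper `count_lines`, and the repetition counts are assumptions made for illustration; the original Lorem Ipsum test data is not available, and timings will vary by regex engine and machine, so no particular result is implied.

```python
import re
import timeit

# The two regexes from the answer: lookahead anchored at the line start
# versus a lookahead repeated before every character.
ANCHORED = re.compile(r"(?m)^(?!.*\bquo\b).+$")
FLOATING = re.compile(r"(?m)^(?:(?!\bquo\b).)+$")

# Hypothetical stand-in for the Lorem Ipsum text used in the answer.
SAMPLE = "\n".join([
    "lorem ipsum dolor",
    "quo vadis",
    "status quo ante",
    "quote me on this",  # "quo" inside "quote" is not a whole word
    "sit amet",
])


def count_lines(pattern: re.Pattern, text: str) -> int:
    """Count non-empty lines matched by `pattern` (lines without the word 'quo')."""
    return len(pattern.findall(text))


anchored_count = count_lines(ANCHORED, SAMPLE)
floating_count = count_lines(FLOATING, SAMPLE)
print(anchored_count, floating_count)

# Rough timing on a larger text built by repeating the sample.
BIG = (SAMPLE + "\n") * 200
for name, pattern in [("anchored", ANCHORED), ("floating", FLOATING)]:
    seconds = timeit.timeit(lambda: count_lines(pattern, BIG), number=50)
    print(f"{name}: {seconds:.4f}s")
```

Both patterns should agree on which lines they accept; only the amount of work per line differs.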
