Regular expressions: find string without substring

Question

I have a big text:

"Big piece of text. This sentence includes 'regexp' word. And this
sentence doesn't include that word"

I need to find a substring that starts with 'this' and ends with 'word' but doesn't include the word 'regexp'.

In this case the string: "this sentence doesn't include that word" is exactly what I want to receive.

How can I do this via Regular Expressions?

Solution

With an ignore case option, the following should work:

\bthis\b(?:(?!\bregexp\b).)*?\bword\b

Example: http://www.rubular.com/r/g6tYcOy8IT

Explanation:

\bthis\b           # match the word 'this', \b is for word boundaries
(?:                # start group, repeated zero or more times, as few as possible
   (?!\bregexp\b)    # fail if 'regexp' can be matched (negative lookahead)
   .                 # match any single character
)*?                # end group
\bword\b           # match 'word'

The \b surrounding each word makes sure that you aren't matching on substrings, like matching the 'this' in 'thistle', or the 'word' in 'wordy'.
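
To see the effect of the word boundaries, here is a small sketch in Python (the answer's example link uses Rubular, a Ruby tester, so Python is an assumption here; the sample strings are made up for illustration):

```python
import re

# Same word with and without \b anchors; IGNORECASE mirrors the
# "ignore case option" mentioned in the answer.
with_boundaries = re.compile(r"\bthis\b", re.IGNORECASE)
without_boundaries = re.compile(r"this", re.IGNORECASE)

sample = "A thistle is not this."  # hypothetical sample text

# Without \b, the 'this' inside 'thistle' is also found.
loose_hits = without_boundaries.findall(sample)

# With \b, only the standalone word matches.
strict_hits = with_boundaries.findall(sample)

# 'wordy' contains 'word', but not as a whole word.
wordy_hit = re.search(r"\bword\b", "wordy")

print(loose_hits, strict_hits, wordy_hit)
```
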

This works by checking at each character between your start word and your end word to make sure that the excluded word doesn't occur.
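
As a rough check of the pattern against the text from the question, the following Python sketch may help. Python, the function name find_without_regexp, and the single-line version of the sample text are assumptions made for this illustration and are not part of the original answer.

```python
import re

# The pattern from the answer, compiled with the ignore-case option.
PATTERN = re.compile(r"\bthis\b(?:(?!\bregexp\b).)*?\bword\b", re.IGNORECASE)


def find_without_regexp(text):
    """Return the first 'this ... word' span that never contains 'regexp', or None."""
    match = PATTERN.search(text)
    return match.group(0) if match else None


# Sample text from the question, joined onto one logical line.
text = (
    "Big piece of text. This sentence includes 'regexp' word. "
    "And this sentence doesn't include that word"
)

result = find_without_regexp(text)
print(result)
```

The first 'This' is rejected because the lookahead fails as soon as the scan reaches 'regexp', so the engine moves on and matches from the second 'this'. Note that in Python `.` does not match a newline by default, so if the text spans several lines the pattern would also need the re.DOTALL flag.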
