C# Regular Expression excluding a string
Problem description
I have a collection of strings, and I want a regex that collects every href value that starts with http:
href="http://www.test.com/cat/1-one_piece_episodes/"
href="http://www.test.com/cat/2-movies_english_subbed/"
href="http://www.test.com/cat/3-english_dubbed/"
href="http://www.exclude.com"
This is my regular expression pattern:
href="(.*?)[^#]"
and it returns this:
href="http://www.test.com/cat/1-one_piece_episodes/"
href="http://www.test.com/cat/2-movies_english_subbed/"
href="http://www.xxxx.com/cat/3-english_dubbed/"
href="http://www.exclude.com"
What pattern would exclude the last match, or more generally exclude any match that contains the exclude domain, such as href="http://www.exclude.com"?
EDIT: for multiple exclusions:
href="((?:(?!"|\bexclude\b|\bxxxx\b).)*)[^#]"
Solution
@ridgerunner and I would change the regex to:
href="((?:(?!\bexclude\b)[^"])*)[^#]"
It matches all href attributes as long as they don't end in # and don't contain the word exclude.
Explanation:
href="          # Match href="
(               # Capture...
  (?:           # the following group:
    (?!         # Look ahead to check that the next part of the string isn't...
      \b        # the entire word
      exclude   # exclude
      \b        # (\b are word boundary anchors)
    )           # End of lookahead
    [^"]        # If successful, match any character except for a quote
  )*            # Repeat as often as possible
)               # End of capturing group 1
[^#]"           # Match a non-# character and the closing quote.
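As a rough illustration of how this pattern might be used from C#, here is a minimal sketch. The class name `HrefFilter` and the method `FindAllowedHrefs` are hypothetical names chosen for this example; only `Regex.Matches` and `Match.Value` come from `System.Text.RegularExpressions`. The sketch returns the whole matched attribute rather than group 1, because the final `[^#]` sits outside the capturing group, so group 1 would omit the last character of the URL.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, named for this example only.
public static class HrefFilter
{
    // The answer's pattern as a verbatim string ("" is an escaped double quote).
    private const string Pattern = @"href=""((?:(?!\bexclude\b)[^""])*)[^#]""";

    // Returns every href="..." attribute in the input that the pattern accepts.
    public static List<string> FindAllowedHrefs(string input)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            // m.Value is the full match, including href=" and the closing quote.
            results.Add(m.Value);
        }
        return results;
    }
}
```

Run against the sample input from the question, this should keep the three www.test.com attributes and skip the www.exclude.com one.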
To allow multiple "forbidden words":
href="((?:(?!\b(?:exclude|this|too)\b)[^"])*)[^#]"
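If the forbidden words are only known at run time, the alternation could be assembled from a list. The following is a sketch under that assumption, not part of the original answer: `MultiWordHrefFilter`, `BuildPattern` and `FindAllowedHrefs` are hypothetical names, and the sketch assumes the list is non-empty. `Regex.Escape` is applied so that a word containing regex metacharacters is matched literally.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, named for this example only.
public static class MultiWordHrefFilter
{
    // Builds the answer's pattern with one alternation branch per forbidden word.
    // Assumes forbiddenWords contains at least one word.
    public static Regex BuildPattern(IEnumerable<string> forbiddenWords)
    {
        // Escape each word so characters such as '.' are treated literally.
        string alternation = string.Join("|", forbiddenWords.Select(w => Regex.Escape(w)));
        return new Regex(
            "href=\"((?:(?!\\b(?:" + alternation + ")\\b)[^\"])*)[^#]\"");
    }

    // Returns every href="..." attribute that contains none of the forbidden words.
    public static List<string> FindAllowedHrefs(string input, IEnumerable<string> forbiddenWords)
    {
        Regex pattern = BuildPattern(forbiddenWords);
        return pattern.Matches(input).Cast<Match>().Select(m => m.Value).ToList();
    }
}
```

For example, calling `FindAllowedHrefs` with the words `exclude` and `xxxx` on the output listed in the question should keep only the two www.test.com attributes.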