C# Regular Expression excluding a string


Problem description

I have a collection of strings, and all I want from the regex is to collect the ones that start with http.

href="http://www.test.com/cat/1-one_piece_episodes/"href="http://www.test.com/cat/2-movies_english_subbed/"href="http://www.test.com/cat/3-english_dubbed/"href="http://www.exclude.com"

This is my regular expression pattern:

href="(.*?)[^#]"

and it returns this:

href="http://www.test.com/cat/1-one_piece_episodes/"
href="http://www.test.com/cat/2-movies_english_subbed/"
href="http://www.xxxx.com/cat/3-english_dubbed/"
href="http://www.exclude.com"

What is the pattern for excluding the last match, or for excluding any match that contains the exclude domain, like href="http://www.exclude.com"?

EDIT: for multiple exclusions:

href="((?:(?!"|\bexclude\b|\bxxxx\b).)*)[^#]"

Solution

@ridgerunner and I would change the regex to:

href="((?:(?!\bexclude\b)[^"])*)[^#]"

It matches all href attributes as long as they don't end in # and don't contain the word exclude.

Explanation:

href="     # Match href="
(          # Capture...
 (?:       # the following group:
  (?!      # Look ahead to check that the next part of the string isn't...
   \b      # the entire word
   exclude # exclude
   \b      # (\b are word boundary anchors)
  )        # End of lookahead
  [^"]     # If successful, match any character except for a quote
 )*        # Repeat as often as possible
)          # End of capturing group 1
[^#]"      # Match a non-# character and the closing quote.
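
As a rough illustration of how the pattern explained above might be used from C#, here is a minimal sketch. The class name HrefFilter and the method FindHrefs are hypothetical names chosen for this example; the input is the sample string from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class HrefFilter
{
    // Pattern from the answer. Quotes are doubled because this is a verbatim string.
    public const string Pattern = @"href=""((?:(?!\bexclude\b)[^""])*)[^#]""";

    // Returns the full text of every match (href="...") found in the input.
    public static List<string> FindHrefs(string input)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        // Sample input from the question.
        string input =
            "href=\"http://www.test.com/cat/1-one_piece_episodes/\"" +
            "href=\"http://www.test.com/cat/2-movies_english_subbed/\"" +
            "href=\"http://www.test.com/cat/3-english_dubbed/\"" +
            "href=\"http://www.exclude.com\"";

        foreach (string href in FindHrefs(input))
        {
            Console.WriteLine(href);
        }
    }
}
```

With the question's sample input this should print the three test.com links and skip the exclude.com one. One thing to watch: because [^#] consumes the last character before the closing quote, capturing group 1 ends one character short of the full URL, so the sketch reads the whole match (m.Value) rather than Groups[1].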

To allow multiple "forbidden words":

href="((?:(?!\b(?:exclude|this|too)\b)[^"])*)[^#]"
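
If the list of forbidden words is only known at run time, the alternation could be built from a collection. The following is a sketch under that assumption; HrefWordFilter and BuildRegex are hypothetical names, and the input reuses the four links from the question's result list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class HrefWordFilter
{
    // Builds the answer's pattern for a list of forbidden words.
    // Assumes the list is non-empty; an empty alternation would change the meaning.
    public static Regex BuildRegex(IEnumerable<string> forbiddenWords)
    {
        // Regex.Escape protects against words that contain regex metacharacters.
        string alternation = string.Join("|", forbiddenWords.Select(w => Regex.Escape(w)));
        string pattern =
            "href=\"((?:(?!\\b(?:" + alternation + ")\\b)[^\"])*)[^#]\"";
        return new Regex(pattern);
    }

    public static void Main()
    {
        // The four links from the question's result list.
        string input =
            "href=\"http://www.test.com/cat/1-one_piece_episodes/\"" +
            "href=\"http://www.test.com/cat/2-movies_english_subbed/\"" +
            "href=\"http://www.xxxx.com/cat/3-english_dubbed/\"" +
            "href=\"http://www.exclude.com\"";

        Regex regex = BuildRegex(new[] { "exclude", "xxxx" });
        foreach (Match m in regex.Matches(input))
        {
            Console.WriteLine(m.Value);
        }
    }
}
```

With "exclude" and "xxxx" as the forbidden words, only the two test.com links should be printed.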
