Regex options: matching across multiple lines and ignoring case
Question
I have a piece of ill-formed HTML in which the double quotes around attribute values are sometimes missing. The tag names also appear sometimes in upper case and sometimes in lower case:
<DIV class="main">
<DIV class="subsection1">
<H2>
<DIV class=subwithoutquote>StackOverflow</DIV></H2></DIV></DIV>
I would like to match across multiple lines and ignore case, but the following pattern does not seem to work. (To combine the options, I also tried | instead of &.)
const string pattern = @"<div class=""?main""?><div class=""?subsection1""?><h2><div class=""?subwithoutquote""?>(.+?)</div>";
Match m = Regex.Match(html, pattern, RegexOptions.IgnoreCase & RegexOptions.Singleline);
Or should I add \n* to the pattern to handle the line breaks?
Recommended answer
The first problem is that your regex does not allow for whitespace between tags. The corrected regex (tested in Rubular and written here as a raw pattern, so each " must be doubled inside a C# verbatim string) is:
<div class="?main"?>\s*<div class="?subsection1"?>\s*<h2>\s*<div class="?subwithoutquote"?>(.+?)<\/div>\s*
Note the addition of several \s* entries. Each \s* matches any run of whitespace, including the line breaks between tags.
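As a rough illustration, the corrected pattern could be used from C# along the following lines. This is only a sketch: the class name TagMatchSketch and the helper TryExtract are hypothetical names introduced here, and the pattern is rewritten as a verbatim string, where "" stands for one literal quote. The options are combined with |.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class TagMatchSketch
{
    // The corrected pattern as a C# verbatim string ("" is one literal quote).
    public const string Pattern =
        @"<div class=""?main""?>\s*<div class=""?subsection1""?>\s*<h2>\s*" +
        @"<div class=""?subwithoutquote""?>(.+?)</div>\s*";

    // Tries to capture the text inside the innermost div.
    public static bool TryExtract(string html, out string inner)
    {
        Match m = Regex.Match(html, Pattern,
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        inner = m.Success ? m.Groups[1].Value : string.Empty;
        return m.Success;
    }

    public static void Main()
    {
        // The sample markup from the question, one tag per line.
        string html =
            "<DIV class=\"main\">\n" +
            "<DIV class=\"subsection1\">\n" +
            "<H2>\n" +
            "<DIV class=subwithoutquote>StackOverflow</DIV></H2></DIV></DIV>";

        if (TryExtract(html, out string inner))
        {
            Console.WriteLine(inner); // prints "StackOverflow"
        }
    }
}
```

RegexOptions.Singleline only changes what . matches (it then also matches \n), so in this sketch it matters only if the captured text itself spans several lines. The line breaks between tags are handled by \s*.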
The second problem is that you are not combining the options correctly.
Your code:
Match m = Regex.Match(html, pattern, RegexOptions.IgnoreCase & RegexOptions.Singleline);
Because these options are bit flags, bitwise AND (the & operator) is the wrong operator here. What you want is bitwise OR (the | operator).
Bitwise AND means "if the bit is set in both of these, leave it set; otherwise, unset it." You need bitwise OR, which means "if the bit is set in either of these, set it; otherwise, unset it."
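The difference is easy to see by printing the two combinations. The sketch below uses a hypothetical class name, FlagCombinationSketch. Since RegexOptions.IgnoreCase and RegexOptions.Singleline occupy different bits, ANDing them leaves no bit set, which is RegexOptions.None.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, for illustration only.
public static class FlagCombinationSketch
{
    // AND keeps only the bits the two flags share, and they share none.
    public static readonly RegexOptions Anded =
        RegexOptions.IgnoreCase & RegexOptions.Singleline;

    // OR keeps every bit that is set in either flag.
    public static readonly RegexOptions Ored =
        RegexOptions.IgnoreCase | RegexOptions.Singleline;

    public static void Main()
    {
        Console.WriteLine(Anded); // prints "None"
        Console.WriteLine(Ored);  // prints "IgnoreCase, Singleline"
        Console.WriteLine(Ored.HasFlag(RegexOptions.IgnoreCase)); // prints "True"
    }
}
```

So the original call was effectively running with no options at all, which is why the upper-case tags never matched.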