What C# regular expression would extract the following content?
Question
I have the following content:
<li id="24">
<span class="button theme">
<span>abc</span>
<div>nanananana</div>
</span>
<section>
content1
</section>
</li>
<li id="25">
<span class="button theme">
<span>xyz</span>
<div>blablabla</div>
</span>
<section>
content2
</section>
</li>
I want to extract output like the following:
24 abc content1
25 xyz content2
How do I write a regular expression to achieve this in C#?
Recommended answer
Use a real HTML parser such as HtmlAgilityPack to parse the HTML. Regular expressions are not well suited to parsing HTML. See "RegEx match open tags except XHTML self-contained tags" (http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags).
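// Requires the HtmlAgilityPack NuGet package and `using System.Linq;`.
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);  // html is a string holding the markup from the question

// Project each <li> into its id, the inner <span> text, and the <section> text.
var result = doc.DocumentNode.Descendants("li")
    .Select(n => new
    {
        Id = n.Attributes["id"].Value,
        SpanValue = n.Element("span").Element("span").InnerText,
        SectionVal = n.Element("section").InnerText.Trim(),
    })
    .ToList();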
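To get from the parsed nodes to the "24 abc content1" layout the question asks for, each item still has to be formatted as a line. The following is a minimal sketch of a complete console program, not part of the original answer. It assumes the HtmlAgilityPack NuGet package is referenced; the `Program` class and the inline `html` string are illustrative, and the null-conditional fallbacks are an added safety measure for `<li>` items that lack the expected children.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Sample markup copied from the question (assumed input).
        string html = @"
<li id=""24"">
  <span class=""button theme"">
    <span>abc</span>
    <div>nanananana</div>
  </span>
  <section>
    content1
  </section>
</li>
<li id=""25"">
  <span class=""button theme"">
    <span>xyz</span>
    <div>blablabla</div>
  </span>
  <section>
    content2
  </section>
</li>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        foreach (var li in doc.DocumentNode.Descendants("li"))
        {
            // GetAttributeValue returns the default ("") when the attribute is missing.
            string id = li.GetAttributeValue("id", "");

            // Fall back to an empty string if an expected child element is absent.
            string label = li.Element("span")?.Element("span")?.InnerText.Trim() ?? "";
            string section = li.Element("section")?.InnerText.Trim() ?? "";

            // One line per <li>: id, inner span text, section text.
            Console.WriteLine($"{id} {label} {section}");
        }
    }
}
```

The trimming matters because the `<section>` text in the sample is surrounded by newlines; without it, each printed line would be broken up by that whitespace.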