How to parse an HTML node's attributes
Question
I use C# and need to parse HTML to read a node's attributes into key-value pairs.
For example, given the following HTML snippet:
<DIV myAttribute style="BORDER-BOTTOM: medium none; BACKGROUND-COLOR: transparent; BORDER-TOP: medium none" id=my_ID anotherAttribNamedDIV class="someclass">
Please note that the attributes can be:
1. key="value" pairs, e.g. class="someclass"
2. key=value pairs (no quotes around the value), e.g. id=my_ID
3. plain attributes that have no value, e.g. myAttribute
I need to store them in a dictionary as key-value pairs, as follows:
key=myAttribute value=""
key=style value="BORDER-BOTTOM: medium none; BACKGROUND-COLOR: transparent; BORDER-TOP: medium none"
key=id value="my_ID"
key=anotherAttribNamedDIV value=""
key=class value="someclass"
I am looking for a regular expression to do this.
Recommended answer
You can use HtmlAgilityPack instead of a regular expression:
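// Requires the HtmlAgilityPack package and: using HtmlAgilityPack;
string myDiv = @"<DIV myAttribute style=""BORDER-BOTTOM: medium none; BACKGROUND-COLOR: transparent; BORDER-TOP: medium none"" id=my_ID anotherAttribNamedDIV class=""someclass""></DIV>";

// Parse the fragment and select the <div> element directly under the document node.
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(myDiv);
HtmlNode node = doc.DocumentNode.SelectSingleNode("div");

// Literal1 is an ASP.NET Literal control; each attribute is written on its own line.
Literal1.Text = "";
foreach (HtmlAttribute attr in node.Attributes)
{
    Literal1.Text += attr.Name + ": " + attr.Value + "<br />";
}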
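The question asks for a dictionary rather than display text, so the same loop can fill a Dictionary<string, string>. The following is a minimal sketch, not part of the original answer: the class name AttributeReader and the method ReadAttributes are hypothetical, and it assumes the HtmlAgilityPack NuGet package is referenced. As far as I know, attr.Name is reported in lower case by default, so the sketch uses a case-insensitive comparer to let a lookup such as "myAttribute" still succeed.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class AttributeReader
{
    // Hypothetical helper: returns the attributes of the first element
    // matching the given XPath expression, or an empty dictionary if none matches.
    public static Dictionary<string, string> ReadAttributes(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Case-insensitive keys, because attribute names may come back lower-cased.
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        if (node == null)
        {
            return result; // no matching element
        }

        foreach (HtmlAttribute attr in node.Attributes)
        {
            // Indexer assignment: a repeated attribute overwrites the earlier
            // entry instead of throwing, and a missing value becomes "".
            result[attr.Name] = attr.Value ?? string.Empty;
        }
        return result;
    }
}
```

Calling AttributeReader.ReadAttributes(myDiv, "//div") on the snippet from the question should give five entries, with an empty string stored for the plain attributes myAttribute and anotherAttribNamedDIV.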