HtmlAgilityPack and selecting Nodes and Subnodes
Question
I hope someone can help me.
Let's say I have an HTML document that contains multiple divs like this example:
<div class="search_hit">
<span prop="name">Richard Winchester</span>
<span prop="company">Kodak</span>
<span prop="street">Arlington Road 1</span>
</div>
<div class="search_hit">
<span prop="name">Ted Mosby</span>
<span prop="company">HP</span>
<span prop="street">Arlington Road 2</span>
</div>
I'm using HtmlAgilityPack to load the HTML document. What I need to know is how I can get the spans for each "search_hit" div.
My first thought was something like this:
foreach (HtmlAgilityPack.HtmlNode node in doc.DocumentNode.SelectNodes("//div[@class='search_hit']"))
{
    foreach (HtmlAgilityPack.HtmlNode node2 in node.SelectNodes("//span[@prop]"))
    {
    }
}
Each div should become an object with its contained spans as properties, i.e.:
public class Record
{
    public string Name { get; set; }
    public string company { get; set; }
    public string street { get; set; }
}
This list should then be filled:
public List<Record> Results = new List<Record>();
But the XPath I'm using does not search within the subnode as it should. It seems to search the whole document again and again.
I already have it working to the point where I get the spans of the whole page, but then I lose the relation between spans and divs: I no longer know which span belongs to which div.
Does somebody know a solution? I have played around with this so much that I'm totally confused now :)
Thanks for your help!
Recommended answer
The following works for me. As BeniBela noted, the important bit is to add a leading dot in the second call to SelectNodes (".//span[@prop]"), so that the XPath is evaluated relative to the current div instead of from the document root.
List<Record> lstRecords = new List<Record>();
foreach (HtmlNode node in doc.DocumentNode.SelectNodes("//div[@class='search_hit']"))
{
    Record record = new Record();
    // The leading dot restricts the search to descendants of the current div.
    foreach (HtmlNode node2 in node.SelectNodes(".//span[@prop]"))
    {
        string attributeValue = node2.GetAttributeValue("prop", "");
        if (attributeValue == "name")
        {
            record.Name = node2.InnerText;
        }
        else if (attributeValue == "company")
        {
            record.company = node2.InnerText;
        }
        else if (attributeValue == "street")
        {
            record.street = node2.InnerText;
        }
    }
    lstRecords.Add(record);
}
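To see why the dot matters, here is a minimal sketch that counts how many spans each XPath matches per div. It is an illustration only, under two assumptions: it uses System.Xml's XmlDocument rather than HtmlAgilityPack so that it runs without the extra package (XPath treats a leading "//" as "start at the document root" in both), and the markup from the question is wrapped in a hypothetical <root> element so it loads as well-formed XML. The names XPathScopeDemo and CountSpansPerDiv are made up for this example.

```csharp
using System;
using System.Xml;

public static class XPathScopeDemo
{
    // The markup from the question, wrapped in a <root> element (assumption)
    // so that XmlDocument accepts it as well-formed XML.
    private const string Markup = @"<root>
  <div class=""search_hit"">
    <span prop=""name"">Richard Winchester</span>
    <span prop=""company"">Kodak</span>
    <span prop=""street"">Arlington Road 1</span>
  </div>
  <div class=""search_hit"">
    <span prop=""name"">Ted Mosby</span>
    <span prop=""company"">HP</span>
    <span prop=""street"">Arlington Road 2</span>
  </div>
</root>";

    // For each search_hit div, evaluates spanXPath with that div as the
    // context node and returns the number of matching nodes.
    public static int[] CountSpansPerDiv(string spanXPath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(Markup);

        XmlNodeList divs = doc.SelectNodes("//div[@class='search_hit']");
        var counts = new int[divs.Count];
        for (int i = 0; i < divs.Count; i++)
        {
            counts[i] = divs[i].SelectNodes(spanXPath).Count;
        }
        return counts;
    }

    public static void Main()
    {
        // "//" starts at the document root, even when called on a div.
        Console.WriteLine(string.Join(",", CountSpansPerDiv("//span[@prop]")));  // prints 6,6

        // ".//" starts at the context node, so only that div's spans match.
        Console.WriteLine(string.Join(",", CountSpansPerDiv(".//span[@prop]"))); // prints 3,3
    }
}
```

With the absolute path, every div reports all six spans on the page, which is exactly the "searches the whole document again and again" behaviour described in the question. With the relative path, each div reports only its own three.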