Parse an HTML document using HtmlAgilityPack
Question
I'm trying to parse the following HTML snippet with HtmlAgilityPack:
<td bgcolor="silver" width="50%" valign="top">
<table bgcolor="silver" style="font-size: 90%" border="0" cellpadding="2" cellspacing="0"
width="100%">
<tr bgcolor="#003366">
<td>
<font color="white">Info
</td>
<td>
<font color="white">
<center>Price
</td>
<td align="right">
<font color="white">Hourly
</td>
</tr>
<tr>
<td>
<a href='test1.cgi?type=1'>Bookbags</a>
</td>
<td>
$156.42
</td>
<td align="right">
<font color="green">0.11%</font>
</td>
</tr>
<tr>
<td>
<a href='test2.cgi?type=2'>Jeans</a>
</td>
<td>
$235.92
</td>
<td align="right">
<font color="red">100%</font>
</td>
</tr>
</table>
</td>
My code looks something like this:
private void ParseHtml(HtmlDocument htmlDoc)
{
    var ItemsAndPrices = new Dictionary<string, int>();
    var findItemPrices = from links in htmlDoc.DocumentNode.Descendants()
                         where links.Name.Equals("table") &&
                               links.Attributes["width"].Equals("100%") &&
                               links.Attributes["bgcolor"].Equals("silver")
                         select new
                         {
                             // select item and price
                         };
}
In this instance, I would like to select the items, Jeans and Bookbags, along with their associated prices, and store them in a dictionary.
For example: Jeans at a price of $235.92.
Does anyone know how to do this properly with HtmlAgilityPack and LINQ?
Recommended answer
Assuming that there could be other rows and you don't specifically want only Bookbags and Jeans, I'd do it like this:
// Requires: using System;
//           using System.Globalization;   // NumberStyles
//           using System.Linq;            // Skip, Take, Select, ToList
var table = htmlDoc.DocumentNode
    .SelectSingleNode("//table[@bgcolor='silver' and @width='100%']");
var query =
    from row in table.Elements("tr").Skip(1)      // skip the header row
    let columns = row.Elements("td").Take(2)      // take only the first two columns
                     .Select(col => col.InnerText.Trim())
                     .ToList()
    select new
    {
        Info = columns[0],
        Price = Decimal.Parse(columns[1], NumberStyles.Currency),
    };
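The question asks for a dictionary, while the query above produces a sequence of anonymous objects. Below is a minimal sketch of one way to get there, not a definitive implementation. The names PriceTableParser and ParsePrices are hypothetical. It assumes that item names are unique (ToDictionary throws on duplicate keys) and that prices are written in US dollar format, which is why it passes an explicit en-US culture instead of relying on the machine's current culture.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using HtmlAgilityPack;

public static class PriceTableParser
{
    // Hypothetical helper (not from the original answer): maps item name -> price.
    public static Dictionary<string, decimal> ParsePrices(HtmlDocument htmlDoc)
    {
        var table = htmlDoc.DocumentNode
            .SelectSingleNode("//table[@bgcolor='silver' and @width='100%']");

        // SelectSingleNode returns null when no node matches the XPath.
        if (table == null)
            return new Dictionary<string, decimal>();

        // Assumption: prices look like "$235.92", so parse them as en-US currency.
        var usCulture = CultureInfo.GetCultureInfo("en-US");

        return table.Elements("tr")
            .Skip(1)                                   // skip the header row
            .Select(row => row.Elements("td")
                .Take(2)                               // item and price columns only
                .Select(col => col.InnerText.Trim())
                .ToList())
            .Where(columns => columns.Count == 2)      // ignore rows with too few cells
            .ToDictionary(
                columns => columns[0],
                columns => decimal.Parse(columns[1], NumberStyles.Currency, usCulture));
    }
}
```

The prices are stored as decimal instead of the int used in the question's Dictionary<string, int>, since values such as $235.92 are not whole numbers. If item names can repeat, grouping the rows or using ToLookup would be a safer choice than ToDictionary.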