Html Agility Pack Empty Values out of Tables
Question
I am trying to learn some basic scraping, and thanks to this site I have been able to learn a lot of new things, but now I am stuck on this problem. This is the code I am using:
var web = new HtmlWeb();
var doc = web.Load("url");
var nodes = doc.DocumentNode.SelectNodes("//*[@id='hotellist_inner']/div");
StreamWriter output = new StreamWriter("out.txt");
if (nodes != null)
{
    foreach (HtmlNode item in nodes)
    {
        if (item != null && item.Attributes["data-recommended"] != null)
        {
            string line = "";
            var nome = item.SelectSingleNode(".//h3/a").InnerText;
            var rating = item.SelectSingleNode(".//span[@class='rating']").InnerText;
            var price = item.SelectSingleNode("./div[2]/div[3]/div[2]/table/tbody/tr/td[4]/div/strong[1]");
            var discount = item.SelectSingleNode("./div[2]/div[3]/div[2]/table/tbody/tr/td[4]/div/div[1]");
            line = line + nome + "," + rating + "," + price + "," + discount;
            Console.WriteLine(line);
            output.WriteLine(line);
        }
    }
}
It all works fine for the first two items (name and rating), but when it comes to price and discount I get empty results. I have analyzed the page (here is the link) with the Chrome scraper, and it gets the results easily with the XPath I have used. I don't understand what I am doing wrong. Any help would be appreciated! :D
Recommended answer
After a quick look at the web page you're trying to scrape, not all items have price and discount information. You need to handle this case properly to avoid an exception, for example by checking for null before getting the InnerText. Your code with this slight change was able to get price and discount information where available:
if (item != null && item.Attributes["data-recommended"] != null)
{
    string line = "";
    var nome = item.SelectSingleNode(".//h3/a").InnerText;
    var rating = item.SelectSingleNode(".//span[@class='rating']").InnerText;
    var price = item.SelectSingleNode("./div[2]/div[3]/div[2]/table/tbody/tr/td[4]/div/strong[1]");
    var discount = item.SelectSingleNode("./div[2]/div[3]/div[2]/table/tbody/tr/td[4]/div/div[1]");
    // set priceString to an empty string if price is null, else set it to price.InnerText
    var priceString = price == null ? "" : price.InnerText;
    // do the same for discountString
    var discountString = discount == null ? "" : discount.InnerText;
    line = line + nome + "," + rating + "," + priceString + "," + discountString;
    Console.WriteLine(line);
    output.WriteLine(line);
}
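The same null check can be factored into a small helper so that every lookup, including name and rating, is guarded. The following is only a sketch: NodeText.TextOrDefault is a hypothetical helper (it is not part of Html Agility Pack), and the inline markup and XPath expressions are made up for illustration rather than taken from the page in the question.

```csharp
using System;
using HtmlAgilityPack;

public static class NodeText
{
    // Hypothetical helper, not part of Html Agility Pack: returns the trimmed
    // InnerText of the first node matching xpath, or fallback when
    // SelectSingleNode finds no match and returns null.
    public static string TextOrDefault(HtmlNode parent, string xpath, string fallback = "")
    {
        HtmlNode node = parent.SelectSingleNode(xpath);
        return node == null ? fallback : node.InnerText.Trim();
    }
}

public static class Demo
{
    public static void Main()
    {
        var doc = new HtmlDocument();
        // Made-up markup: the second item has no price element.
        doc.LoadHtml(
            "<div id='list'>" +
            "<div><h3><a>Hotel A</a></h3><strong class='price'>120</strong></div>" +
            "<div><h3><a>Hotel B</a></h3></div>" +
            "</div>");

        foreach (HtmlNode item in doc.DocumentNode.SelectNodes("//div[@id='list']/div"))
        {
            string name = NodeText.TextOrDefault(item, ".//h3/a");
            string price = NodeText.TextOrDefault(item, ".//strong[@class='price']");
            // An item without a price yields an empty field instead of an exception.
            Console.WriteLine(name + "," + price);
        }
    }
}
```

With C# 6 or later, the null-conditional and null-coalescing operators express the same check inline: item.SelectSingleNode(xpath)?.InnerText ?? "".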