C# HTML Agility Pack Single Select Node returning null

Problem description

I have a web scraper developed using C#, Windows Forms and the HTML Agility Pack.

It was all working well until the site changed its code and broke it. I know this happens often with web scrapers, but I am having trouble figuring out how to correct the issue.

At the moment my scraper loops through multiple URLs and scrapes data from each page.

The problem I am running into is that the site will randomly serve a newer template for some of those pages, and that template does not have the HTML classes and IDs I have defined in the program. What I am trying to do is run a simple if that checks whether a single node is null and, if it is, runs a separate set of code for the new template.

However, my program throws a NullReferenceException on my if statement.

Here is the statement I am using to check whether it is null:

var varitem = doc.DocumentNode.SelectSingleNode("//h1[@class='producttitle']").InnerText;

 if (varitem == null) MessageBox.Show("no titles");

It throws the exception on the first line, where varitem is defined, and never reaches the if statement.

Any suggestions are appreciated!

Recommended answer

First, you should check whether

 doc.DocumentNode.SelectSingleNode("//h1[@class='producttitle']")

returns null.

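If it is null, the NullReferenceException comes from reading InnerText on that null result: the exception is thrown before anything is assigned to varitem, so the if statement never runs.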
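
As a minimal sketch of that check, the snippet below tests the node returned by SelectSingleNode before reading InnerText. The class name TitleScraper, the method GetTitle and the sample HTML strings are hypothetical, made up for illustration; only the XPath expression comes from the question. It assumes the HtmlAgilityPack package is referenced.

```csharp
using System;
using HtmlAgilityPack;

class TitleScraper
{
    // Hypothetical helper: returns the product title, or null when the page
    // has no <h1 class="producttitle"> (for example, the newer template).
    public static string GetTitle(HtmlDocument doc)
    {
        // SelectSingleNode returns null when nothing matches the XPath.
        HtmlNode titleNode = doc.DocumentNode.SelectSingleNode("//h1[@class='producttitle']");

        // Test the node itself before touching InnerText.
        if (titleNode == null)
        {
            return null;
        }

        return titleNode.InnerText;
    }

    static void Main()
    {
        // Made-up markup standing in for the old and new page templates.
        var oldTemplate = new HtmlDocument();
        oldTemplate.LoadHtml("<html><body><h1 class='producttitle'>Widget</h1></body></html>");

        var newTemplate = new HtmlDocument();
        newTemplate.LoadHtml("<html><body><h1 class='name'>Widget</h1></body></html>");

        Console.WriteLine(GetTitle(oldTemplate) ?? "no titles"); // prints "Widget"
        Console.WriteLine(GetTitle(newTemplate) ?? "no titles"); // prints "no titles"
    }
}
```

In the scraper itself, the null branch is where the separate code for the new template would go. On C# 6 or later, the same guard can be written inline with the null-conditional operator: doc.DocumentNode.SelectSingleNode("//h1[@class='producttitle']")?.InnerText evaluates to null instead of throwing when no node matches.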
