Finding an element using XPath in HtmlAgilityPack


Question



Hi,
I am building a scraping tool. I have loaded the web page using HtmlAgilityPack.

Now I have to select elements using an XPath expression. I have built the expression
"//html/body/div[5]/div/div/div[2]/div[0]/div/div[1]/div[0]". This is not working.

When I tried using

doc.DocumentNode.SelectNodes("html") => working // even ("body") => working
but if I use
doc.DocumentNode.SelectNodes("html/body") => not working





Does anybody know how to identify elements using the above expression through HtmlAgilityPack? I have done my best searching the internet, but found no good solution. In my case the XPath varies every time, so I am looking for something that returns the element for a given XPath.

Thanks for your reply.

Recommended answer

Check the results below:
using HtmlAgilityPack;

var html = "\r\n<html>\r\n<body>\r\n\r\n<p>This is a paragraph.</p>\r\n<p>This is a para" +
"graph.</p>\r\n<p>This is a paragraph.</p>\r\n\r\n</body>\r\n</html>\r\n";
HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html);
// Paths without a leading slash are evaluated relative to the document node.
var t1 = doc.DocumentNode.SelectNodes("html"); // working: <html> is a child of the document node
var t2 = doc.DocumentNode.SelectNodes("body"); // not working: <body> is not a direct child of the document node
var t3 = doc.DocumentNode.SelectNodes("html[1]/body"); // working
var t4 = doc.DocumentNode.SelectNodes("html/body"); // working
var t5 = doc.DocumentNode.SelectNodes("//body"); // working: // searches the whole document



Here doc.DocumentNode.SelectNodes("body") returns no results because there is no body node directly under the document node. You can use the //body XPath instead to find the node anywhere in the document.
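
Since the question asks for a way to get an element from an arbitrary XPath, the following is a minimal sketch of a null-safe lookup, not a definitive implementation. The class XPathLookup, the helper Find, and the sample HTML are hypothetical names and data introduced only for illustration. It assumes HtmlAgilityPack's default behaviour, where SelectSingleNode and SelectNodes return null when nothing matches. It also shows that XPath positions start at 1, so a step such as div[0] (as in the expression from the question) can never match.

```csharp
using System;
using HtmlAgilityPack;

static class XPathLookup
{
    // Hypothetical helper: returns the first node matching xpath, or null if none matches.
    public static HtmlNode Find(HtmlDocument doc, string xpath)
    {
        return doc.DocumentNode.SelectSingleNode(xpath);
    }

    static void Main()
    {
        var doc = new HtmlDocument();
        // Sample markup, assumed for illustration only.
        doc.LoadHtml("<html><body><div><p>first</p><p>second</p></div></body></html>");

        // XPath positions are 1-based, so p[1] is the first <p>.
        var first = Find(doc, "/html/body/div/p[1]");
        Console.WriteLine(first?.InnerText ?? "(no match)");

        // p[0] cannot match, because no node has position 0.
        var zero = Find(doc, "/html/body/div/p[0]");
        Console.WriteLine(zero?.InnerText ?? "(no match)");

        // SelectNodes returns null by default when nothing matches, so check before iterating.
        var nodes = doc.DocumentNode.SelectNodes("//p");
        if (nodes != null)
        {
            foreach (var node in nodes)
            {
                // node.XPath is the absolute path HtmlAgilityPack computes for this node.
                Console.WriteLine(node.XPath + " => " + node.InnerText);
            }
        }
    }
}
```

Printing node.XPath for nodes found with a loose query such as //p is one way to see the exact path HtmlAgilityPack assigns to an element, which can then be compared against a hand-built expression.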


Try using the plugin from the link below:
http://watin.org/

