WebDriver can find element using xpath, Html Agility Pack cannot

Problem description

I have continually had problems with Html Agility Pack; my XPath queries only ever work when they are extremely simple:

//*[@id='some_id']

or

//input

However, as soon as they get more complicated, Html Agility Pack can't handle them. Here's an example demonstrating the problem: I'm using WebDriver to navigate to Google and return the page source, which is then passed to Html Agility Pack. Both WebDriver and HtmlAgilityPack attempt to locate the same element/node (C#):

using System;
using System.Threading;
using HtmlAgilityPack;
using OpenQA.Selenium.Firefox;

// The XPath query
const string xpath = "//form//tr[1]/td[1]//input[@name='q']";

// Navigate to Google and get the page source
var driver = new FirefoxDriver(new FirefoxProfile()) { Url = "http://www.google.com" };
Thread.Sleep(2000);

// Can WebDriver find it?
// Note: FindElementByXPath throws NoSuchElementException when nothing matches,
// so reaching the next line already means WebDriver succeeded.
var e = driver.FindElementByXPath(xpath);
Console.WriteLine(e != null ? "Webdriver success" : "Webdriver failure");

// Can Html Agility Pack find it?
var source = driver.PageSource;
var htmlDoc = new HtmlDocument { OptionFixNestedTags = true };
htmlDoc.LoadHtml(source);
// SelectNodes returns null (not an empty collection) when nothing matches
var nodes = htmlDoc.DocumentNode.SelectNodes(xpath);
Console.WriteLine(nodes != null ? "Html Agility Pack success" : "Html Agility Pack failure");

driver.Quit();

In this case, WebDriver successfully located the item, but Html Agility Pack did not.

I know, I know: in this case it's very easy to change the xpath to one that will work, such as //input[@name='q']. But that only fixes this specific example, which isn't the point. I need something that will exactly, or at least closely, mirror the behavior of WebDriver's xpath engine, or even of the FirePath or FireFinder add-ons for Firefox.

If WebDriver can find it, then why can't Html Agility Pack find it too?

Solution

The issue you're running into is with the FORM element. HTML Agility Pack handles that element differently - by default, it will never report that it has children.

In the particular example you gave, this query does find the target element:

.//div/div[2]/table/tr/td/table/tr/td/div/table/tr/td/div/div[2]/input

However, this does not, so it's clear the form element is tripping up the parser:

.//form/div/div[2]/table/tr/td/table/tr/td/div/table/tr/td/div/div[2]/input

That behavior is configurable, though. If you place this line prior to parsing the HTML, the form will give you child nodes:

HtmlNode.ElementsFlags.Remove("form");
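
To see where that line goes, here is a minimal sketch that runs the question's XPath against the same markup before and after removing the flag. The inline HTML string and the CanFind helper are hypothetical stand-ins (the real page source would come from driver.PageSource), and the result may differ between Html Agility Pack versions, so no particular output is claimed here:

```csharp
using System;
using HtmlAgilityPack;

class FormFlagDemo
{
    // Hypothetical helper: parses the markup and reports whether the XPath matched.
    // SelectNodes returns null when there is no match.
    static bool CanFind(string html, string xpath)
    {
        var doc = new HtmlDocument { OptionFixNestedTags = true };
        doc.LoadHtml(html);
        return doc.DocumentNode.SelectNodes(xpath) != null;
    }

    static void Main()
    {
        // Hypothetical stand-in for driver.PageSource
        const string html =
            "<html><body><form action='/search'><table>" +
            "<tr><td><input name='q'></td></tr>" +
            "</table></form></body></html>";

        const string xpath = "//form//tr[1]/td[1]//input[@name='q']";

        // Parsed with the library's default element flags
        Console.WriteLine("Default flags: " + CanFind(html, xpath));

        // ElementsFlags is static, so this affects every document parsed afterwards.
        // It must run before LoadHtml for the document you care about.
        HtmlNode.ElementsFlags.Remove("form");

        Console.WriteLine("After removing the form flag: " + CanFind(html, xpath));
    }
}
```

Because ElementsFlags is a static dictionary, it is probably simplest to remove the entry once at application start-up rather than before each parse.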
