How to extract XML data using XPath with Selenium Webdriver

Question

I'm using Selenium Webdriver (ver 2.31.2.0) (.Net) and I'm trying to extract elements from the XML that is returned by driver.PageSource.

My question: how do I get the list of items using the XPath below? It works in Firefox with an XPath add-on, but the same expression does not work in Selenium Webdriver.

Any help?

Here is my code in Selenium Webdriver:

var driver = new FirefoxDriver();
driver.Navigate().GoToUrl("http://website_name/languages.xml");
string _page_source = driver.PageSource;
ReadOnlyCollection<IWebElement> webElements = _page_source.FindElementsByXPath("//response//results//items/vList");

My XML looks like this:

<response xmlns="http://schemas.datacontract.org/2004/07/myproj.cnn.com"
          xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
    <meta>

    </meta>
    <results i:type="vList">
        <name>Language</name>
        <queryValue>language</queryValue>
        <displayOrder>0</displayOrder>
        <items>
            <vList>
                <name>English</name>
                <displayName>English</displayName>
                <displayOrder>0</displayOrder>
                <items />
            </vList>
            <vList>
                <name>Swedish</name>
                <displayName>Swedish</displayName>
                <displayOrder>1</displayOrder>
                <items />
            </vList>
        </items>
    </results>
</response>

Recommended answer

You can use Selenium to browse to and obtain the XML, but work with the XML itself using .NET classes.

The driver.PageSource property is a string, so you should parse the XML it contains directly with .NET classes. Also, a string object has no FindElementsByXPath() method, unless that is an extension method you have written yourself.

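Read the XML from Selenium's driver.PageSource. PageSource holds the XML text itself, while XmlReader.Create(string) expects a URI, so wrap the text in a StringReader:

var driver = new FirefoxDriver();
driver.Navigate().GoToUrl("http://website_name/languages.xml");
// PageSource is the document text, not a URI, so read it through a StringReader
XmlReader reader = XmlReader.Create(new System.IO.StringReader(driver.PageSource));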

Or skip the browser and read the XML directly from the URL:

XmlReader reader = XmlReader.Create("http://website_name/languages.xml");

Then use the code below to parse and read the XML. The key point is how the namespace information is supplied to the XPath query: the document declares a default namespace, so every element name in the expression needs a prefix mapped to that namespace.

// load the XML document
XElement xmlDocumentRoot = XElement.Load(reader);
// register the namespace; "a" is an arbitrary prefix chosen for the default namespace
XmlNameTable nameTable = reader.NameTable;
XmlNamespaceManager namespaceManager = new XmlNamespaceManager(nameTable);
namespaceManager.AddNamespace("a", "http://schemas.datacontract.org/2004/07/myproj.cnn.com");

// now query the XML; remember to prefix every element in the default namespace
var items = xmlDocumentRoot.XPathSelectElements("//a:results/a:items/a:vList", namespaceManager);

Console.WriteLine("vlist has {0} items.", items.Count());

foreach (var item in items)
{
    Console.WriteLine("Display name: {0}", item.XPathSelectElement("a:displayName", namespaceManager).Value);
}
// OR get a list of all display names using linq
var displayNames = items.Select(x => x.XPathSelectElement("a:displayName", namespaceManager).Value).ToList();

You will need the following namespaces for the above to work:

using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;
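
As a possible alternative to XPath, the same elements can be reached with plain LINQ to XML by combining an XNamespace with each element name. The following is only a minimal sketch, not part of the original answer: the class name VListDemo, the method GetDisplayNames, and the trimmed SampleXml string are made up for illustration, and the XML is parsed from a string so the example runs without a browser.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class VListDemo
{
    // Hypothetical, trimmed copy of the XML from the question.
    public const string SampleXml = @"<response xmlns=""http://schemas.datacontract.org/2004/07/myproj.cnn.com""
          xmlns:i=""http://www.w3.org/2001/XMLSchema-instance"">
  <results i:type=""vList"">
    <items>
      <vList><name>English</name><displayName>English</displayName><items /></vList>
      <vList><name>Swedish</name><displayName>Swedish</displayName><items /></vList>
    </items>
  </results>
</response>";

    // Returns the displayName of every vList directly under results/items.
    public static List<string> GetDisplayNames(string xml)
    {
        // The default namespace declared on <response> applies to all child elements.
        XNamespace ns = "http://schemas.datacontract.org/2004/07/myproj.cnn.com";
        XElement root = XElement.Parse(xml);

        return root.Element(ns + "results")
                   .Element(ns + "items")
                   .Elements(ns + "vList")
                   .Select(v => (string)v.Element(ns + "displayName"))
                   .ToList();
    }

    public static void Main()
    {
        foreach (string name in GetDisplayNames(SampleXml))
        {
            Console.WriteLine("Display name: {0}", name);
        }
    }
}
```

With Selenium, the string passed to GetDisplayNames would be driver.PageSource. This variant needs no XmlNamespaceManager, at the cost of spelling out each step of the path instead of writing a single XPath expression.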
