Parsing HTML with xpath or cssSelector?


Question

How do I parse for just the text portions of these blocks of code? I am using Selenium client drivers in java.

<li id="NOT_PUT_PREF_STORE" style="">
<span id="STORE_AVAIL" class="BodyLBoldGrey StockStat">Out of stock</span> <span id="InYourLocal">in your local</span> <span id="storeRollover_2"><span id="STORE_CITY" class="BodyLBoldLtgry VIBSStore1">West Hills</span></span> store<span id="notSelectOptionSOI">.</span>
</li>

<li id="NOT_PUT_PREF_STORE" style="">
<span id="STORE_AVAIL" class="BodyLLtgry StockStat">Not carried</span> <span class="BodyLLtgry" id="InYourLocal">in your local</span> <span id="storeRollover_2"><span id="STORE_CITY" class="BodyLBoldLtgry VIBSStore1">West Hills</span></span> store<span id="notSelectOptionSOI">.</span>
</li>

<li id="NOT_PUT_PREF_STORE" style="">
<span id="STORE_AVAIL" class="BodyMBold StockStatGreen">In stock</span> <span id="InYourLocal">in your local</span> <span id="storeRollover_2"><span id="STORE_CITY" class="BodyLBoldLtgry VIBSStore1">West Hills</span></span> store<span id="notSelectOptionSOI">.</span>
</li>

I am trying to extract the text portion of each of these variations of the web element (i.e. Not carried, In stock, Out of stock). I am very new to Selenium and HTML parsing, so I am struggling to get this working.

I was thinking it would be something like this:

WebElement driver = new FirefoxDriver(profile);
driver.get(Url);
System.out.println(driver.getElement(By.id("STORE_AVAIL").getText());

Not sure how I would do it with cssSelector but people tell me that is faster. Would this work?

driver.getElement(By.xpath("//li[@id='NOT_PUT_PREF_STORE']./span[@id='STORE_AVAIL']").getText()

Recommended answer

When I try to find elements on the page I always build my locators by:

  1. id = driver.findElement(By.id("STORE_AVAIL")).getText();
  2. css selector = driver.findElement(By.cssSelector("span#STORE_AVAIL")).getText();
  3. xpath = driver.findElement(By.xpath("//span[@id='STORE_AVAIL']")).getText();
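
To sanity-check an XPath expression before handing it to WebDriver, one option is to evaluate it against a copy of the markup with the XPath engine that ships with the JDK. The sketch below is not Selenium code. It assumes the fragment is well-formed XML (the snippets in the question happen to be), and the class and method names are made up for illustration. It also tries the question's longer expression with the stray `.` before `/span` removed, since `]./span` is not valid XPath.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; uses only the JDK, not Selenium.
class XPathCheck {
    // Parses a well-formed markup fragment and returns the trimmed string
    // value of the first node matched by the XPath expression.
    static String firstText(String markup, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc).trim();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Shortened copy of the first <li> from the question.
        String li = "<li id=\"NOT_PUT_PREF_STORE\" style=\"\">"
                + "<span id=\"STORE_AVAIL\" class=\"BodyLBoldGrey StockStat\">Out of stock</span> "
                + "<span id=\"InYourLocal\">in your local</span> store"
                + "</li>";

        // Locator from the list above.
        System.out.println(firstText(li, "//span[@id='STORE_AVAIL']"));
        // The question's expression with the stray '.' removed.
        System.out.println(firstText(li,
                "//li[@id='NOT_PUT_PREF_STORE']/span[@id='STORE_AVAIL']"));
        // Both lines print: Out of stock
    }
}
```

A browser's XPath engine works on the live DOM of a real HTML page, so this only checks the expression itself, not timing, frames, or markup that is not well-formed.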

The id seems to be the fastest and easiest, both for webdriver and for me. id should be unique on the page.

CSS takes a little more investigative work on my part, but WebDriver handles it just fine.

Lastly, xpath is sometimes unavoidable (unless you buy the devs a beer and ask nicely for a change to the application so you can locate the element faster - after all, you are testing for them anyway). Locating by xpath in IE is terribly slow, and writing complex xpaths is a drag.

Xpath is also fragile: one small change to the DOM can render your xpath unusable. Then you get to debug and rewrite your xpath (it is as fun as it sounds).
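
The fragility point can be illustrated without a browser. The sketch below uses the JDK's XPath engine rather than Selenium, and the two markup strings are invented for the example: the second simply wraps the span in a new `<div>`. The path-based expression stops matching after that change, while the id-based one still finds the element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses only the JDK, not Selenium.
class BrittleXPathDemo {
    // Returns the string value of the first match, or "" when nothing matches.
    static String eval(String markup, String xpath) {
        try {
            Object doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(markup)));
            return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Invented markup: the page before and after a small layout change.
        String before = "<ul><li><span id=\"STORE_AVAIL\">In stock</span></li></ul>";
        String after = "<ul><li><div><span id=\"STORE_AVAIL\">In stock</span></div></li></ul>";

        String byPath = "/ul/li/span";                // depends on the exact nesting
        String byId = "//span[@id='STORE_AVAIL']";    // depends only on the id

        System.out.println("[" + eval(before, byPath) + "]"); // [In stock]
        System.out.println("[" + eval(after, byPath) + "]");  // []
        System.out.println("[" + eval(before, byId) + "]");   // [In stock]
        System.out.println("[" + eval(after, byId) + "]");    // [In stock]
    }
}
```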

My suggestion is to use Firebug and FirePath addons for Firefox to help you craft your locators.
