Parsing HTML with xpath or cssSelector?
Problem description
How do I extract just the text portions of these blocks of HTML? I am using the Selenium client drivers in Java.
<li id="NOT_PUT_PREF_STORE" style="">
<span id="STORE_AVAIL" class="BodyLBoldGrey StockStat">Out of stock</span> <span id="InYourLocal">in your local</span> <span id="storeRollover_2"><span id="STORE_CITY" class="BodyLBoldLtgry VIBSStore1">West Hills</span></span> store<span id="notSelectOptionSOI">.</span>
</li>
or
<li id="NOT_PUT_PREF_STORE" style="">
<span id="STORE_AVAIL" class="BodyLLtgry StockStat">Not carried</span> <span class="BodyLLtgry" id="InYourLocal">in your local</span> <span id="storeRollover_2"><span id="STORE_CITY" class="BodyLBoldLtgry VIBSStore1">West Hills</span></span> store<span id="notSelectOptionSOI">.</span>
</li>
or
<li id="NOT_PUT_PREF_STORE" style="">
<span id="STORE_AVAIL" class="BodyMBold StockStatGreen">In stock</span> <span id="InYourLocal">in your local</span> <span id="storeRollover_2"><span id="STORE_CITY" class="BodyLBoldLtgry VIBSStore1">West Hills</span></span> store<span id="notSelectOptionSOI">.</span>
</li>
I am trying to extract the text portion in each of these variations of the web element (i.e. Not carried, In stock, Out of stock). I am very new to Selenium and HTML parsing, so I am struggling to get this working.
I was thinking it would be something like this:
WebDriver driver = new FirefoxDriver(profile);
driver.get(Url);
System.out.println(driver.findElement(By.id("STORE_AVAIL")).getText());
I am not sure how I would do it with cssSelector, but people tell me that is faster. Would this work?
driver.getElement(By.xpath("//li[@id='NOT_PUT_PREF_STORE']./span[@id='STORE_AVAIL']").getText()
Recommended answer
When I try to find elements on the page, I always build my locators using one of:
- id =
driver.findElement(By.id("STORE_AVAIL")).getText();
- css selector =
driver.findElement(By.cssSelector("span#STORE_AVAIL")).getText();
- xpath =
driver.findElement(By.xpath("//span[@id='STORE_AVAIL']")).getText();
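To check an XPath expression against the markup from the question without starting a browser, something like the following may help. It is only a sketch: it uses the JDK's built-in `javax.xml.xpath` package rather than Selenium, the class `StoreAvailXPath` and its helpers `textAt` and `listItem` are hypothetical names, and it assumes the snippet is well-formed XML (a real page often is not). Note that the expression in the question has a stray `.` before `/span`, which is not valid XPath; the form used here drops it.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration; not part of Selenium.
class StoreAvailXPath {

    // Returns the string value of the first node matched by the expression,
    // or an empty string when nothing matches.
    static String textAt(String markup, String expression) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(markup)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    // Builds a trimmed-down copy of the <li> from the question.
    static String listItem(String cssClass, String status) {
        return "<li id=\"NOT_PUT_PREF_STORE\" style=\"\">"
                + "<span id=\"STORE_AVAIL\" class=\"" + cssClass + "\">" + status + "</span> "
                + "<span id=\"InYourLocal\">in your local</span> store"
                + "</li>";
    }

    public static void main(String[] args) {
        // The three class/text variants shown in the question.
        String[][] variants = {
            {"BodyLBoldGrey StockStat", "Out of stock"},
            {"BodyLLtgry StockStat", "Not carried"},
            {"BodyMBold StockStatGreen", "In stock"},
        };
        for (String[] v : variants) {
            String li = listItem(v[0], v[1]);
            // Both expressions select the same span, whatever its class is.
            System.out.println(textAt(li, "//span[@id='STORE_AVAIL']"));
            System.out.println(textAt(li, "//li[@id='NOT_PUT_PREF_STORE']/span[@id='STORE_AVAIL']"));
        }
    }
}
```

Because both expressions key on the `id` attribute, the changing `class` values across the three variants do not affect the result.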
The id seems to be the fastest and easiest, both for WebDriver and for me. An id should be unique on the page.
CSS takes a little more investigative work on my part, but WebDriver handles it just fine.
Lastly, xpath is sometimes unavoidable (unless you buy the devs a beer and ask nicely for a change to the application so you can locate the element faster; after all, you are testing it for them anyway). Locating by xpath in IE is terribly slow, and writing complex xpaths is a drag.
Xpath is also fragile: one small change to the DOM can render your xpath unusable. Then you get to debug and rewrite your xpath (it is as fun as it sounds).
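The fragility point can be illustrated with a small sketch, again using the JDK's `javax.xml.xpath` instead of Selenium. Everything here is an assumption made for illustration: the class name `XPathFragility`, the simplified markup, and the wrapper `<div class="row">` standing in for a hypothetical redesign. A positional expression stops matching once the wrapper appears, while the id-based one still finds the span.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class for illustration; not part of Selenium.
class XPathFragility {

    // Simplified markup, assumed well-formed so the JDK XML parser accepts it.
    static final String BEFORE =
            "<ul><li id=\"NOT_PUT_PREF_STORE\">"
            + "<span id=\"STORE_AVAIL\">In stock</span>"
            + "</li></ul>";

    // Hypothetical redesign: the span is now wrapped in an extra <div>.
    static final String AFTER =
            "<ul><li id=\"NOT_PUT_PREF_STORE\"><div class=\"row\">"
            + "<span id=\"STORE_AVAIL\">In stock</span>"
            + "</div></li></ul>";

    // Locator that depends on the exact element nesting.
    static final String POSITIONAL = "/ul/li[1]/span[1]";

    // Locator that depends only on the id attribute.
    static final String BY_ID = "//span[@id='STORE_AVAIL']";

    // Returns the string value of the first matching node, or "" if none matches.
    static String textAt(String markup, String expression) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(markup)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("positional, before: '" + textAt(BEFORE, POSITIONAL) + "'");
        System.out.println("positional, after:  '" + textAt(AFTER, POSITIONAL) + "'");
        System.out.println("by id, before:      '" + textAt(BEFORE, BY_ID) + "'");
        System.out.println("by id, after:       '" + textAt(AFTER, BY_ID) + "'");
    }
}
```

In Selenium the failed lookup would surface as an exception from `findElement` rather than an empty string, but the underlying cause is the same: the positional path no longer describes the page.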
My suggestion is to use the Firebug and FirePath add-ons for Firefox to help you craft your locators.