Simulate clicking a link on a web page

Published: 2021/10/2 18:42:46 · Tags: r selenium xml-parsing

Question

I am trying to scrape the webpage below:

http://www.houseoffraser.co.uk/Eliza+J+3/4+sleeve+ruched+waist+dress/165288648,default,pd.html

The stock data for each colour/size combination appears only when a colour or size is selected. Is it possible to simulate this in R to get the data?

So far, I have been able to capture the colours and sizes:

mcolour <- toString(xpathSApply(page, '//ul[@class="colour-swatches-list toggle-panel"]//li[@title]', xmlGetAttr, "title"))

size <- xpathSApply(page, '//ul[@class="size-swatches-list toggle-panel"]//li[@data-size]', xmlGetAttr, "data-size")

But I am not sure how to capture stock levels per colour/size combination.

Any guidance would be appreciated!

============================================================

I could not find new as a method. Am I missing anything?

firefoxClass
Generator for class "firefoxClass":

Class fields:

Name:  exceptionTable     javaWarMes     javaDriver   javaNavigate
Class:         matrix            ANY            ANY            ANY

Class Methods:  
"back", "callSuper", "close", "copy", "export", "field", "findElementByClassName", 
 "findElementByCssSelector", "findElementById", "findElementByLinkText",  "findElementByName", 
 "findElementByPartialLinkText", "findElementByTagName", "findElementByXPath", 
 "findElementsByClassName", "findElementsByCssSelector", "findElementsById", 
 "findElementsByLinkText", "findElementsByName", "findElementsByPartialLinkText", 
 "findElementsByTagName", "findElementsByXPath", "forward", "get", "getCapabilities", 
 "getClass", "getCurrentUrl", "getPageSource", "getRefClass", "getTitle", "getVersion", 
  "import", "initFields", "initialize", "initialize#exceptionClass", "printHtml",   "refresh", 
  "show", "show#envRefClass", "trace", "tryExc", "untrace", "usingMethods"


  Reference Superclasses:  
  "exceptionClass", "envRefClass"

Recommended answer

For a given product ID pid, which you can scrape from the page, you can get stock availability by querying:

http://www.houseoffraser.co.uk/on/demandware.store/Sites-hof-Site/default/Product-UpdateQuantityList?pid=165288698&quantity=1
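As a rough sketch, the query above could be built and fetched from R along these lines. The endpoint is the one quoted in this answer; the helper names stock_query_url and fetch_stock_chunk are made up for illustration, and the fetch step assumes network access and that the endpoint still responds in the same way.

```r
# Endpoint quoted in the answer; everything else here is illustrative.
base_url <- paste0(
  "http://www.houseoffraser.co.uk/on/demandware.store/",
  "Sites-hof-Site/default/Product-UpdateQuantityList"
)

# Build the query URL for one product ID (hypothetical helper).
stock_query_url <- function(pid, quantity = 1) {
  sprintf("%s?pid=%s&quantity=%d", base_url, pid, as.integer(quantity))
}

# Fetch the raw HTML/JavaScript chunk as a single string (hypothetical helper).
# Requires network access; the site may have changed since the answer was written.
fetch_stock_chunk <- function(pid, quantity = 1) {
  con <- url(stock_query_url(pid, quantity))
  on.exit(close(con))
  paste(readLines(con, warn = FALSE), collapse = "\n")
}

# Building the URL needs no network access:
stock_query_url("165288698")
```

Calling fetch_stock_chunk("165288698") would then return the chunk as text, which can be inspected or parsed further in R.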

You don't even need to set any cookies for that query. It returns a chunk of HTML and JavaScript that is used to set the control on the page. Here's an example of limited stock (currently 2, although I might have just bought all of them by accident):

http://www.houseoffraser.co.uk/on/demandware.store/Sites-hof-Site/default/Product-UpdateQuantityList?pid=165288648&quantity=1

You can parse the availabilityMessage string or