Scraping HTML with Selenium while clicking options

Question

I have a script that I use to scrape data from websites using Selenium.

Sub Body_Building()
    Dim driver As New WebDriver, post As Object
    Dim i As Long    ' current output row

    With driver
        .Start "chrome", "http://www.bodybuildingwarehouse.co.uk"
        .Get "/optimum-nutrition?limit=all"
    End With

    ' Ignore lookup errors so a product missing a name or price leaves the cell empty
    On Error Resume Next
    For Each post In driver.FindElementsByClass("grid-info")
        i = i + 1
        Cells(i, 1) = post.FindElementByClass("product-name").Text
        Cells(i, 2) = post.FindElementByXPath(".//span[@class='regular-price']//span[@class='price']|.//p[@class='special-price']//span[@class='price']").Text
    Next post
End Sub

Would it be possible to scrape data from this website using the same or a similar technique, so that the output looks like the snapshot below?

Here is the VBA, reworked so that it matches the desired outcome. Thank you, SMth80.

Sub optigura_scraper_v2()
    Dim driver As New ChromeDriver
    Dim elems As Object, post As Object

    driver.Get "https://www.optigura.com/uk/product/gold-standard-100-whey/"
    [A1:D1].Value = [{"Name","Flavor","Size","Price"}]

    Set elems = driver.FindElementsByXPath("//span[@class='img']/img")
    i = 2

    For n = 1 To elems.Count
        driver.FindElementsByXPath("//span[@class='img']/img")(n).Click
        driver.Wait 1000
        For Each post In driver.FindElementsByXPath("//div[@class='colright']//ul[@class='opt2']//label")
            Cells(i, 1) = driver.FindElementByXPath("//h1[@itemprop='name']").Text
            Cells(i, 2) = post.Text
            Cells(i, 3) = Split(driver.FindElementByXPath("//li[@class='active']//span[@class='img']/img").Attribute("alt"), "-")(1)
            Cells(i, 4) = driver.FindElementByXPath("//span[@class='price']").Text
            i = i + 1
        Next post
    Next n
End Sub

Solution

Try this. It is certainly not the best technique, but it should serve your purpose. Note that the scraper records the data exactly as it is displayed on that page.

Sub optigura_scraper()
    Dim driver As New ChromeDriver
    Dim elems As Object, post As Object

    driver.Get "https://www.optigura.com/uk/product/gold-standard-100-whey/"
    [A1:D1].Value = [{"Name","Price","Size","Flavor"}]

    Set elems = driver.FindElementsByXPath("//span[@class='img']/img")
    i = 2

    For N = 1 To elems.Count
        driver.FindElementsByXPath("//span[@class='img']/img")(N).Click
        driver.Wait 1000
        Cells(i, 1) = driver.FindElementByXPath("//h1[@itemprop='name']").Text
        Cells(i, 2) = driver.FindElementByXPath("//span[@class='price']").Text
        Cells(i, 3) = Split(driver.FindElementByXPath("//li[@class='active']//span[@class='img']/img").Attribute("alt"), "-")(1)
        For Each post In driver.FindElementsByXPath("//div[@class='colright']//ul[@class='opt2']//label")
            Cells(i, 4) = post.Text
            i = i + 1
        Next post
    Next N
End Sub
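
Both versions of the scraper take the size from the image's alt text with Split(..., "-")(1), which raises "Subscript out of range" if the text contains no hyphen. A minimal sketch of a more defensive helper is below. The name ExtractSize and the sample alt text are hypothetical, chosen only for illustration; the real attribute value on the page may be formatted differently.

```vba
Option Explicit

' Returns the text after the first hyphen in altText, trimmed of spaces.
' Falls back to the whole trimmed string when there is no hyphen, the case
' where Split(altText, "-")(1) would raise "Subscript out of range".
Function ExtractSize(ByVal altText As String) As String
    Dim parts() As String
    parts = Split(altText, "-")
    If UBound(parts) >= 1 Then
        ExtractSize = Trim$(parts(1))
    Else
        ExtractSize = Trim$(altText)
    End If
End Function

Sub DemoExtractSize()
    ' Hypothetical alt text, assumed to follow a "<name> - <size>" pattern
    Debug.Print ExtractSize("Gold Standard 100% Whey - 908g")
    ' No hyphen: the input is returned trimmed instead of raising an error
    Debug.Print ExtractSize("908g")
End Sub
```

Inside the loop, the size cell could then be filled with Cells(i, 3) = ExtractSize(driver.FindElementByXPath("//li[@class='active']//span[@class='img']/img").Attribute("alt")). If the product name itself contained a hyphen, index 1 would no longer be the size, so the pattern is worth checking against the actual alt text first.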
