使用VBA在Web上屏蔽的URL [英] Web scrapping of masked URL using VBA

查看:119
本文介绍了使用VBA在Web上屏蔽的URL的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想从网站



您需要通过VBA重现该请求并解析HTML响应。有一个示例显示了如何执行此操作:

  Option Explicit 

Sub Test()

'添加引用
'Microsoft HTML对象库
'Microsoft XML,v6.0

Dim sResp As String
Dim rOutputCell作为范围
Dim oElememnt
Dim cElements作为IHTMLElementCollection
Dim oTableRow
Dim oTableCell

'从网站
检索HTML使用新的MSXML2.XMLHTTP60
'发送请求
。打开 POST, https://dps.psx.com.pk/webpages/SL_main_page.php,True
.SetRequestHeader内容类型,应用程序/ x-www-form-urlencoded
。发送 symbolCode = EFOODS
直到.ReadyState = 4:DoEvents:循环
sResp = .ResponseText

结尾'解析响应并使用新的HTMLDocument输出

'将响应HTML加载到DOM
.body.inne rHTML = sResp
'清除输出的第一个工作表
ThisWorkbook.Sheets(1).Cells.Delete
'解析SL_cmpInfo表并输出
Set rOutputCell = ThisWorkbook.Sheets(1) .Cells(1,1)
Set oElememnt = .getElementsByClassName( SL_cmpText)(0)
rOutputCell.Value = oElememnt.innerText
'解析SL_mktStats1表并输出
Set rOutputCell = Cells(3,1)
设置cElements = .getElementsByClassName( SL_mktStats1)
对于数组(cElements(1),cElements(2),cElements(3))$ b $中的每个元素b对于oElememnt.getElementsByTagName( tr)中的每个oTableRow
对于oTableRow.getElementsByTagName( td)
中的每个oTableCell
rOutputCell.Value = oTableCell.innerText
设置rOutputCell = rOutputCell.Offset (0,1)
下一个
设置rOutputCell = rOutputCell.Offset(1,0).EntireRow.Cells(1,1)
下一个
下一个
'解析SL_announce表并输出
设置rOutputCell = rOutputCell.Offset(1,0)
设置oElememnt = .getElementsByClassName( SL_announce)(0 )
对于oElememnt.getElementsByTagName( tr)中的每个oTableRow
对于oTableRow.getElementsByTagName( td)中的每个oTableCell
rOutputCell.Value = oTableCell.innerText
设置rOutputCell = rOutputCell.Offset(0,1)
下一个
设置rOutputCell = rOutputCell.Offset(1,0).EntireRow.Cells(1,1)
下一个
以$结尾b $ b MsgBox已完成

结束子

别忘了添加必要的引用:





根据需要:




I want to scrape some stock data from a website https://dps.psx.com.pk/ using VBA in Excel, but the problem is the URL of this website does not change.

When I click on market summary as highlighted in image#1 Image#1

that will return the whole market summary, I just need to scrape data in Excel using VBA as highlighted in the image#2. Image#2

I tried to examine the network with fiddler as shown in image#3 Image#3

and develop the following code in VBA.

Option Explicit

Sub Test()

    ' Add references
    ' Microsoft HTML Object Library
    ' Microsoft XML, v6.0

    Dim sResp As String
    Dim rOutputCell As Range
    Dim oElememnt
    Dim cElements As IHTMLElementCollection
    Dim oTableRow
    Dim oTableCell



    ' Retrieve HTML from website
    With New MSXML2.XMLHTTP60
        ' Send request
        .Open "GET", "https://dps.psx.com.pk/webpages/mktSummary.php?r=REG", True
        Do Until .ReadyState = 4: DoEvents: Loop
        sResp = .ResponseText
    End With
    
    
    ' Parse response and output
    With New HTMLDocument
        ' Load response HTML into DOM
        .body.innerHTML = sResp
        ' Clear first worksheet for output
        ThisWorkbook.Sheets(1).Cells.Delete
        
        Set rOutputCell = Cells(3, 1)
        Set oElememnt = .getElementsByClassName("tableHead")(0)
        For Each oTableRow In oElememnt.getElementsByTagName("tr")
            For Each oTableCell In oTableRow.getElementsByTagName("td")
                rOutputCell.Value = oTableCell.innerText
                Set rOutputCell = rOutputCell.Offset(0, 1)
            Next
            Set rOutputCell = rOutputCell.Offset(1, 0).EntireRow.Cells(1, 1)
        Next
    
      End With

    MsgBox "Completed"

End Sub

But when I run this code it just shows running but nothing happens even after waiting for sometime. I don't know whether it got stuck in Event Loop or some other problem is there please help.

解决方案

All the necessary info to scrape that data you may find in captured by Fiddler request which is logged after you made an input of the quote symbol in a browser manual:

You need to reproduce that request via VBA and parse HTML response. There is the example showing how that might be done:

Option Explicit

Sub Test()

    ' Add references
    ' Microsoft HTML Object Library
    ' Microsoft XML, v6.0

    Dim sResp As String
    Dim rOutputCell As Range
    Dim oElememnt
    Dim cElements As IHTMLElementCollection
    Dim oTableRow
    Dim oTableCell

    ' Retrieve HTML from website
    With New MSXML2.XMLHTTP60
        ' Send request
        .Open "POST", "https://dps.psx.com.pk/webpages/SL_main_page.php", True
        .SetRequestHeader "Content-Type", "application/x-www-form-urlencoded"
        .Send "symbolCode=EFOODS"
        Do Until .ReadyState = 4: DoEvents: Loop
        sResp = .ResponseText
    End With
    ' Parse response and output
    With New HTMLDocument
        ' Load response HTML into DOM
        .body.innerHTML = sResp
        ' Clear first worksheet for output
        ThisWorkbook.Sheets(1).Cells.Delete
        ' Parse SL_cmpInfo table and output
        Set rOutputCell = ThisWorkbook.Sheets(1).Cells(1, 1)
        Set oElememnt = .getElementsByClassName("SL_cmpText")(0)
        rOutputCell.Value = oElememnt.innerText
        ' Parse SL_mktStats1 tables and output
        Set rOutputCell = Cells(3, 1)
        Set cElements = .getElementsByClassName("SL_mktStats1")
        For Each oElememnt In Array(cElements(1), cElements(2), cElements(3))
            For Each oTableRow In oElememnt.getElementsByTagName("tr")
                For Each oTableCell In oTableRow.getElementsByTagName("td")
                    rOutputCell.Value = oTableCell.innerText
                    Set rOutputCell = rOutputCell.Offset(0, 1)
                Next
                Set rOutputCell = rOutputCell.Offset(1, 0).EntireRow.Cells(1, 1)
            Next
        Next
        ' Parse SL_announce table and output
        Set rOutputCell = rOutputCell.Offset(1, 0)
        Set oElememnt = .getElementsByClassName("SL_announce")(0)
        For Each oTableRow In oElememnt.getElementsByTagName("tr")
            For Each oTableCell In oTableRow.getElementsByTagName("td")
                rOutputCell.Value = oTableCell.innerText
                Set rOutputCell = rOutputCell.Offset(0, 1)
            Next
            Set rOutputCell = rOutputCell.Offset(1, 0).EntireRow.Cells(1, 1)
        Next
    End With
    MsgBox "Completed"

End Sub

Don't forget to add the necessary references:

The output for me is as follows:

As required:

这篇关于使用VBA在Web上屏蔽的URL的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆