VBA - 网页抓取找不到正确的 GET 请求 [英] VBA - web scraping can not find correct GET request
问题描述
我的问题与其他问题有关 VBA - 网页抓取不能获取 HTMLElement 内部文本.我也有类似的问题
My question is related to other question VBA - web scraping can not get HTMLElement innerText. I have a similar problem
网站网址 - https://www.hnb.hr/en/core-functions/monetary-policy/exchange-rate-list/exchange-rate-list
我需要获取货币参考日期和所选值.问题是我找不到最终生成这些值的正确 GET
请求.我发现它与 POST
请求有关:
I need to get the date of currency reference and the selected values. The problem is that I can not find a correct GET
request where these values are finally generated. I've found that it is related to the POST
request:
POST
/en/core-functions/monetary-policy/exchange-rate-list/exchange-rate-list?p_p_id=tecajnalistacontroller_WAR_hnbtecajnalistaportlet&p_p_lifecycle=2&p_p_state=normal&p_p_mode=view&p_p_resource_id=getTecajnaAjaxDataURL&p_p_cacheability=cacheLevelPage&p_p_col_id=column-2&p_p_col_count=2 HTTP/1.1
POST
/en/core-functions/monetary-policy/exchange-rate-list/exchange-rate-list?p_p_id=tecajnalistacontroller_WAR_hnbtecajnalistaportlet&p_p_lifecycle=2&p_p_state=normal&p_p_mode=view&p_p_resource_id=getTecajnaAjaxDataURL&p_p_cacheability=cacheLevelPage&p_p_col_id=column-2&p_p_col_count=2 HTTP/1.1
我想使用通过 id、class 或 tag 获取的技术 - 无论如何,前提是 GET
URL 请求太快而无法检索所需的信息
I would like to use a technique with getting by id, class or tag - whatever but again, provided GET
URL request is too quick to retrieve the required info
推荐答案
XMLHTTP 请求和 API:
我会使用他们的 API,如下所示.我有一些辅助函数来帮助解析响应.在GetDict
函数中你可以设置你感兴趣的货币.在函数GetRate
中你可以指定你感兴趣的汇率.如果你不指定,它默认为median_rate"
.
XMLHTTP request and API:
I would use their API as shown below. I have some helper functions to aid with parsing the response. In GetDict
function you can set the currencies you are interested in. In function GetRate
you can specify the rate you are interested in. If you don't specify, it defaults to "median_rate"
.
调用 API:
要获取特定日期的费率,请向以下网址:
To get the rates for a particular date, make a[n] HTTP call to the following URL:
http://hnbex.eu/api/v1/rates/daily/?date=YYYY-MM-DD
日期参数是可选的.如果没有设置,当前日期(今天)使用.
The date parameter is optional. If not set, the current date (today) is used.
您可以使用 JSON parser
但我发现使用 Split
从 JSON
字符串中获取所需的信息更简单.如果您熟悉 JSON
,我会很高兴地更新一个 JSON 解析示例.
You can parse the JSON
response with a JSON parser
but I found it simpler to go with using Split
to grab the required info from the JSON
string. If you are familiar with JSON
I will happily update with a JSON parsing example.
Option Explicit
Public Sub GetInfo()
'http://hnbex.eu/api/v1/
Dim strJSON As String, http As Object, json As Object
Const URL As String = "http://hnbex.eu/api/v1/rates/daily/"
Set http = CreateObject("MSXML2.XMLHTTP")
With http
.Open "GET", URL, False
.send
strJSON = .responseText
End With
'Set json = JsonConverter.ParseJson(strJSON) '<== You could parse the JSON using a JSON parse such as [JSONConverter][1]
Dim currencyDict As Object
Set currencyDict = GetDict
Dim key As Variant, dictKeys As Variant, result As Variant
For Each key In currencyDict.keys
result = GetRate(strJSON, key)
If Not IsError(result) Then currencyDict(key) = result
result = vbNullString
Next key
PrintDictionary currencyDict
End Sub
Public Function GetDict() As Object '<== You could adapt to pass currencies as string arguments to the function. Or even a string array.
Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")
dict.Add "EUR", vbNullString
dict.Add "CZK", vbNullString
dict.Add "HRK", vbNullString
dict.Add "HUF", vbNullString
dict.Add "PLN", vbNullString
dict.Add "RON", vbNullString
dict.Add "RSD", vbNullString
Set GetDict = dict
End Function
Public Function GetRate(ByVal json As String, ByVal key As Variant, Optional ByVal rate As String = "median_rate") As Variant
Dim arr() As String, tempString As String
On Error GoTo Errhand
arr = Split(json, """currency_code"": " & Chr$(34) & key & Chr$(34))
tempString = arr(1)
tempString = Split(arr(1), Chr$(34) & rate & Chr$(34) & ":")(1)
tempString = Split(tempString, ",")(0)
GetRate = tempString
Exit Function
Errhand:
GetRate = CVErr(xlErrNA)
End Function
Public Sub PrintDictionary(ByVal dict As Object)
Dim key As Variant
For Each key In dict.keys
Debug.Print key & " : " & dict(key)
Next
End Sub
<小时>
Internet Explorer:
您可以使用带有显式等待元素出现在页面(或填充)上的循环
Internet Explorer:
You can use an loop with explicit wait for element to be present on page (or populated)
Option Explicit
Public Sub GetInfo()
Dim IE As New InternetExplorer, t As Date, hTable As HTMLTable, clipboard As Object
Const WAIT_TIME_SECS As Long = 5
t = Timer
With IE
.Visible = True
.navigate "https://www.hnb.hr/en/core-functions/monetary-policy/exchange-rate-list/exchange-rate-list"
While .Busy Or .readyState < 4: DoEvents: Wend
Do
DoEvents
On Error Resume Next
Set hTable = .document.getElementById("records_table")
On Error GoTo 0
If Timer - t > WAIT_TIME_SECS Then Exit Do
Loop While hTable Is Nothing
If hTable Is Nothing Then
.Quit
Exit Sub
End If
Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
clipboard.SetText hTable.outerHTML
clipboard.PutInClipboard
ThisWorkbook.Worksheets("Sheet1").Cells(1, 1).PasteSpecial
.Quit '<== Remember to quit application
End With
End Sub
这篇关于VBA - 网页抓取找不到正确的 GET 请求的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!