VBA - 网页抓取找不到正确的 GET 请求 [英] VBA - web scraping can not find correct GET request

查看:82
本文介绍了VBA - 网页抓取找不到正确的 GET 请求的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的问题与其他问题有关 VBA - 网页抓取不能获取 HTMLElement 内部文本.我也有类似的问题

My question is related to other question VBA - web scraping can not get HTMLElement innerText. I have a similar problem

网站网址 - https://www.hnb.hr/en/core-functions/monetary-policy/exchange-rate-list/exchange-rate-list

我需要获取货币参考日期和所选值.问题是我找不到最终生成这些值的正确 GET 请求.我发现它与 POST 请求有关:

I need to get the date of currency reference and the selected values. The problem is that I can not find a correct GET request where these values are finally generated. I've found that it is related to the POST request:

POST/en/core-functions/monetary-policy/exchange-rate-list/exchange-rate-list?p_p_id=tecajnalistacontroller_WAR_hnbtecajnalistaportlet&p_p_lifecycle=2&p_p_state=normal&p_p_mode=view&p_p_resource_id=getTecajnaAjaxDataURL&p_p_cacheability=cacheLevelPage&p_p_col_id=column-2&p_p_col_count=2 HTTP/1.1

POST /en/core-functions/monetary-policy/exchange-rate-list/exchange-rate-list?p_p_id=tecajnalistacontroller_WAR_hnbtecajnalistaportlet&p_p_lifecycle=2&p_p_state=normal&p_p_mode=view&p_p_resource_id=getTecajnaAjaxDataURL&p_p_cacheability=cacheLevelPage&p_p_col_id=column-2&p_p_col_count=2 HTTP/1.1

我想使用通过 id、class 或 tag 获取的技术 - 无论如何,前提是 GET URL 请求太快而无法检索所需的信息

I would like to use a technique with getting by id, class or tag - whatever but again, provided GET URL request is too quick to retrieve the required info

推荐答案

XMLHTTP 请求和 API:

我会使用他们的 API,如下所示.我有一些辅助函数来帮助解析响应.在GetDict函数中你可以设置你感兴趣的货币.在函数GetRate中你可以指定你感兴趣的汇率.如果你不指定,它默认为median_rate".

XMLHTTP request and API:

I would use their API as shown below. I have some helper functions to aid with parsing the response. In GetDict function you can set the currencies you are interested in. In function GetRate you can specify the rate you are interested in. If you don't specify, it defaults to "median_rate".

调用 API:

要获取特定日期的费率,请向以下网址:

To get the rates for a particular date, make a[n] HTTP call to the following URL:

http://hnbex.eu/api/v1/rates/daily/?date=YYYY-MM-DD

日期参数是可选的.如果没有设置,当前日期(今天)使用.

The date parameter is optional. If not set, the current date (today) is used.

您可以使用 JSON parser 但我发现使用 SplitJSON 字符串中获取所需的信息更简单.如果您熟悉 JSON,我会很高兴地更新一个 JSON 解析示例.

You can parse the JSON response with a JSON parser but I found it simpler to go with using Split to grab the required info from the JSON string. If you are familiar with JSON I will happily update with a JSON parsing example.

Option Explicit

Public Sub GetInfo()
    'http://hnbex.eu/api/v1/
    Dim strJSON As String, http As Object, json As Object
    Const URL As String = "http://hnbex.eu/api/v1/rates/daily/"

    Set http = CreateObject("MSXML2.XMLHTTP")
    With http
        .Open "GET", URL, False
        .send
        strJSON = .responseText
    End With
    'Set json = JsonConverter.ParseJson(strJSON) '<== You could parse the JSON using a JSON parse such as [JSONConverter][1]

    Dim currencyDict As Object
    Set currencyDict = GetDict

    Dim key As Variant, dictKeys As Variant, result As Variant
    For Each key In currencyDict.keys
        result = GetRate(strJSON, key)
        If Not IsError(result) Then currencyDict(key) = result
        result = vbNullString
    Next key

    PrintDictionary currencyDict

End Sub

Public Function GetDict() As Object '<== You could adapt to pass currencies as string arguments to the function. Or even a string array.
    Dim dict As Object
    Set dict = CreateObject("Scripting.Dictionary")
    dict.Add "EUR", vbNullString
    dict.Add "CZK", vbNullString
    dict.Add "HRK", vbNullString
    dict.Add "HUF", vbNullString
    dict.Add "PLN", vbNullString
    dict.Add "RON", vbNullString
    dict.Add "RSD", vbNullString
    Set GetDict = dict
End Function

Public Function GetRate(ByVal json As String, ByVal key As Variant, Optional ByVal rate As String = "median_rate") As Variant
    Dim arr() As String, tempString As String
    On Error GoTo Errhand
    arr = Split(json, """currency_code"": " & Chr$(34) & key & Chr$(34))
    tempString = arr(1)
    tempString = Split(arr(1), Chr$(34) & rate & Chr$(34) & ":")(1)
    tempString = Split(tempString, ",")(0)
    GetRate = tempString
    Exit Function
Errhand:
    GetRate = CVErr(xlErrNA)
End Function

Public Sub PrintDictionary(ByVal dict As Object)
    Dim key As Variant
    For Each key In dict.keys
        Debug.Print key & " : " & dict(key)
    Next
End Sub

<小时>

Internet Explorer:

您可以使用带有显式等待元素出现在页面(或填充)上的循环


Internet Explorer:

You can use an loop with explicit wait for element to be present on page (or populated)

Option Explicit
Public Sub GetInfo()
    Dim IE As New InternetExplorer, t As Date, hTable As HTMLTable, clipboard As Object
    Const WAIT_TIME_SECS As Long = 5
    t = Timer

    With IE
        .Visible = True
        .navigate "https://www.hnb.hr/en/core-functions/monetary-policy/exchange-rate-list/exchange-rate-list"

        While .Busy Or .readyState < 4: DoEvents: Wend

        Do
            DoEvents
            On Error Resume Next
            Set hTable = .document.getElementById("records_table")
            On Error GoTo 0
            If Timer - t > WAIT_TIME_SECS Then Exit Do
        Loop While hTable Is Nothing

        If hTable Is Nothing Then
            .Quit
            Exit Sub
        End If
        Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
        clipboard.SetText hTable.outerHTML
        clipboard.PutInClipboard
        ThisWorkbook.Worksheets("Sheet1").Cells(1, 1).PasteSpecial
        .Quit                                    '<== Remember to quit application
    End With
End Sub

这篇关于VBA - 网页抓取找不到正确的 GET 请求的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆