Scrape with xmlhttp
Question
I would like to get data from https://www.goaloong.net/football/6in1. This page contains a table.
I have tried:
Sub REQUESTXML()
    Dim XMLHttpRequest As xmlHttp
    Dim HTMLDoc As New HTMLDocument
    Dim elem As Object
    Dim x As Long

    Set XMLHttpRequest = New MSXML2.xmlHttp
    XMLHttpRequest.Open "GET", "https://www.goaloong.net/football/6in1", False
    XMLHttpRequest.send
    While XMLHttpRequest.readyState = 200
        DoEvents
    Wend
    Debug.Print XMLHttpRequest.responseText
    HTMLDoc.Body.innerHTML = XMLHttpRequest.responseText

    x = 1
    For Each elem In HTMLDoc.getElementsByClassName("Leaguestitle")
        Sheets("req").Range("A" & x).Value = HTMLDoc.getElementsByTagName("a")(0).innerText
        x = x + 1
    Next elem
End Sub
I get no results.
Could you please help me?
Recommended answer
The page https://www.goaloong.net/football/6in1 is dynamic: the JavaScript files are loaded first, and those scripts then load the content. A plain XMLHTTP request therefore returns only the initial HTML, without the table data. One approach is to load the full page in Internet Explorer and read the rendered content from there. Example below (tested):
Sub REQUESTXML()
    ' Requires a reference to "Microsoft Internet Controls".
    Dim IE As New InternetExplorer
    Dim elem As Object
    Dim x As Long

    IE.navigate "https://www.goaloong.net/football/6in1"
    ' Wait until IE reports that the page has finished loading.
    Do While IE.Busy Or IE.readyState <> READYSTATE_COMPLETE: DoEvents: Loop

    ' For debugging: dump the rendered HTML next to the workbook.
    Open ThisWorkbook.Path & "\TESTFILE.html" For Output As #1
    Print #1, IE.document.body.innerHTML
    Close #1

    ' Write the text of each league title into column A of the first sheet.
    x = 1
    For Each elem In IE.document.getElementsByClassName("Leaguestitle")
        Sheets(1).Range("A" & x).Value = elem.innerText
        x = x + 1
    Next elem

    IE.Quit
End Sub
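To check whether a page of this kind is dynamic before reaching for a browser, one option is to fetch the raw HTML and see whether the class name you are looking for appears in it at all. The following is only a minimal sketch: `RawHtmlContains` and `CheckPage` are hypothetical helper names, not part of the original answer, and the sketch assumes MSXML 6.0 is installed and uses late binding so that no extra reference is needed.

```vba
' Hypothetical helper (not from the original answer):
' returns True if the raw, unrendered HTML of url contains marker.
Function RawHtmlContains(ByVal url As String, ByVal marker As String) As Boolean
    Dim http As Object
    ' Late binding; assumes MSXML 6.0 is available on the machine.
    Set http = CreateObject("MSXML2.XMLHTTP.6.0")

    http.Open "GET", url, False   ' False = synchronous, so no wait loop is needed
    http.send

    If http.Status = 200 Then
        ' Case-insensitive search of the response body for the marker text.
        RawHtmlContains = (InStr(1, http.responseText, marker, vbTextCompare) > 0)
    End If
End Function

Sub CheckPage()
    ' Prints whether "Leaguestitle" appears in the initial response.
    Debug.Print RawHtmlContains("https://www.goaloong.net/football/6in1", "Leaguestitle")
End Sub
```

If the marker is missing from the raw response but visible in the browser, the content is most likely being added by script, and a browser-based approach such as the one above is needed.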