从快递网站使用VBA进行网页搜刮 [英] Web Scraping using VBA from courier website
问题描述
我想在excel列A中记录跟踪编号的包裹,并在其他列上记录可用字段的详细信息,以便每当我按下按钮运行模块时,它都会为我提供有关从网站获取的包裹详细信息的信息.我定位的网站是" http://trackandtrace.courierpost.co.nz/Search/一个>". 我已经编写了在链接后嵌入跟踪号并获取其他字段的代码,但是代码未获取任何数据,只是使用Internet Explorer打开了链接.我得到的错误是
I want to have a record of parcels having tracking numbers in excel column A and details of the available field on other columns so that whenever I press the button to run module it updates me about the parcel details fetching from the website. The website I am targeting is "http://trackandtrace.courierpost.co.nz/Search/". I have made the code to embed tracking number after that link and to fetch other fields but the code is not fetching any data it just opens up the link using internet explorer. The error i get is
这是我的代码:
Sub Yellowcom()
'Dim ieObj As InternetExplorer
Dim htmlELe As IHTMLElement
Dim HTML As HTMLDocument
Dim i As Integer
Dim x As Integer
Dim URL As String
Dim URLParameter As String
Dim page As Long
Dim links As Object
Dim IE As Object
i = 1
Set IE = CreateObject("InternetExplorer.Application")
'Set ieObj = New InternetExplorer
IE.Visible = True
URL = "http://trackandtrace.courierpost.co.nz/search/"
'Application.Wait Now + TimeValue("00:00:05")
x = 1
For page = 2 To 10
If page > 1 Then URLParameter = Sheet1.Cells(x, 1).Value
IE.navigate URL & URLParameter
' Wait for the browser to load the page
Do Until IE.readyState = 4
DoEvents
Loop
Set HTML = IE.document
Set OrganicLinks = HTML.getElementsByClassName("search-results organic")
Set links = OrganicLinks.Item(0).getElementsByClassName("info")
For Each htmlELe In links
With ActiveSheet
.Range("A" & i).Value = htmlELe.Children(0).textContent
On Error Resume Next
.Range("B" & i).Value = htmlELe.getElementsByClassName("track-visit-website")(0).href
On Error GoTo 0
On Error Resume Next
.Range("C" & i).Value = htmlELe.getElementsByClassName("info-section info-secondary")(0).href
On Error GoTo 0
'.Range("B" & i).Value = htmlELe.getElementsByTagName("a")(0).href
'.Range("C" & i).Value = htmlELe.Children(2).textContent
.Range("D" & i).Value = htmlELe.Children(2).querySelector("a[href]")
'links2 = htmlELe.getElementsByClassName("links")(1)
' .Range("D" & i).Value = links2.href
End With
i = i + 1
x = x + 1
Next htmlELe
Next page
IE.Quit
Set IE = Nothing
End Sub
推荐答案
我强烈建议您使用后台对象将信息发送到网站,例如以下MSXML2对象可用于发送GET和POST请求,在以下代码中,我使用搜索代码(从A列中的值中提取)向您的网站发送请求,然后将所需的传递状态和时间xml放入B列
I highly recommend you use background objects to send info to websites, e.g. the following MSXML2 objects can be used to send GET and POST requests, in the following code I'm sending a request to your website with the search code (pulled from values in column A) and then putting your required delivery status and time xml in column B
Sub demoMsxml2()
Dim mySearchCode As String
Dim myConnection As Object
Dim Status As String
Dim i As Long
Dim lastRow As Long
lastRow = Sheet1.Cells(Sheet1.Rows.Count, 1).End(xlUp).Row
For i = 1 To lastRow
mySearchCode = Sheet1.Range("A" & i).Value2
Set myConnection = CreateObject("MSXML2.ServerXMLHTTP.6.0")
Call myConnection.Open("GET", "http://trackandtrace.courierpost.co.nz/Search/" + mySearchCode)
myConnection.send
Sheet1.Range("B" & i).Value2 = ExtractString(Trim(Replace(myConnection.responseText, vbCrLf, "")), "<li class=""status""><span", "</li>")
Next i
End Sub
Function ExtractString(parentString As String, beginsWith As String, endsWith As String) As String
Dim a As Long: a = InStr(1, parentString, beginsWith)
Dim b As Long: b = InStr(a, parentString, endsWith)
If (a <> 0 And b <> 0) Then ExtractString = Trim(Mid(parentString, a, b - a)) Else ExtractString = ""
End Function
除了将文本放入B列外,您还可以从中抓取数据.使用此方法意味着您无需在屏幕上看到任何东西,无需创建Internet Explorer实例,无需等待页面加载等.所有这些都将自动处理.
Instead of putting the text into column B you can just scrape your data from it. Using this method means you don't have to see anything on the screen, no creating internet explorer instances, no waiting for pages to load etc. it's all handled automatically.
这篇关于从快递网站使用VBA进行网页搜刮的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!