VBA - 抓取没有 id 的 HTML 表格 [英] VBA - Scrape HTML table without id
问题描述
我正在尝试使用 VBA 从 html 表中获取数据.从列表框中选择一个值,填充一个文本框并单击一个按钮后,表格就会出现.但是网站的url并没有改变.
我的程序确实填充了框,选择列表框值并单击搜索"按钮,但是我无法从表中获取数据.
我需要页面末尾表格单元格的值.(第二个 < t d >)
这里是
<小时>参考:
通过 VBE > 工具 > 参考的 HTML 对象库
<小时>根据您的代码大纲进行调整,但仍在使用 pestania = 27
选项显式公共子 GetInfo()Dim ie As New InternetExplorer, hTable As HTMLTable, lista() As String, id As String, rut As String, enlace As StringApplication.ScreenUpdating = Falseid = Worksheets("Lista").Cells(2, 1).Valuelista = Split(id, "-")车辙=列表(0)enlace = "http://www.cmfchile.cl/institucional/mercados/entidad.php?auth=&send=&mercado=V&rut=" &车辙&&grupo=&tipoentidad=FINRE&vig=VI&row=AAAw+cAAhAABP4MAAz&control=svs&pestania=27"与即.可见 = 真.navigate enlace '"http://www.cmfchile.cl/institucional/mercados/entidad.php?auth=&send=&mercado=V&rut=9278&grupo=&tipoentidad=FINRE&vig=VI&row=AAAw%20cAAhAABP4MAAz&control=svs&pestania=27"虽然 .Busy 或 .readyState <4:DoEvents:温德.document.getElementById("aa").Value = 2017.document.forms("consulta").submit做事件出错时继续下一步Set hTable = .document.getElementsByTagName("table")(1)出错时转到 0循环而 hTable 什么都不是WriteTable hTable, 1, ActiveSheet结束于Application.ScreenUpdating = True结束子
I am trying to get data from a html table with VBA. After selecting a value from a list box, filling a text box and clicking a button the table appears. But the url of the website does not change.
My program does fill the box, select the list box value and click the "search" button, but then I can't get the data from the table.
I need the values of the table's cells at the end of the page. (second < t d >)
Here's the url of the page:
Code:
Sub Info()
Dim enlace As String
Dim id As String
Dim lista
Dim rut As Integer
Dim i As Integer
Dim largo As Integer
largo = Worksheets("Lista").Cells(rows.Count, 1).End(xlUp).Row
id = Worksheets("Lista").Cells(2, 1).Value
lista = Split(id, "-")
rut = lista(0)
enlace = "http://www.cmfchile.cl/institucional/mercados/entidad.php?auth=&send=&mercado=V&rut=" & rut & "&grupo=&tipoentidad=FINRE&vig=VI&row=AAAw+cAAhAABP4MAAz&control=svs&pestania=1"
Set objIE = CreateObject("InternetExplorer.application")
objIE.Visible = False
objIE.Navigate (enlace)
Do
If objIE.ReadyState = 4 Then
objIE.Visible = False
Exit Do
Else
DoEvents
End If
Loop
Dim button_name As String
button_name = "Aportantes"
Set link = objIE.document.getElementsByTagName("A")
For Each Hyperlink In link
If InStr(Hyperlink.innerText, button_name) > 0 Then
Hyperlink.Click
Exit For
End If
Next
Dim nuevoLink As String
nuevoLink = Hyperlink
objIE.Quit
Set ie = CreateObject("InternetExplorer.application")
ie.Visible = False
ie.Navigate (nuevoLink)
Do
If ie.ReadyState = 4 Then
ie.Visible = False
Exit Do
Else
DoEvents
End If
Loop
Dim sem As String
Dim ano As Integer
sem = "03"
ano = 2018
Dim aportantes As Object
Dim cuotas_emitidas As Object
ie.document.getElementById("semestre").Value = sem
ie.document.getElementById("aa").Value = ano
Set elems = ie.document.getElementsByTagName("input")
For Each e In elems
If (e.getAttribute("value") = "Consultar") Then
e.Click
''HERE IS THE PROBLEM
Set aportantes = ie.document.getElementsByTagName("table")(1).getElementsByTagName("tr")(0).getElementsByTagName("tr")(1)
ThisWorkbook.Worksheets("Lista").Cells(i, 4).Value = aportantes
Set cuotas_emitidas = ie.document.getElementsByTagName("table")(1).getElementsByTagName("tr")(1).getElementsByTagName("tr")(1).innerText
ThisWorkbook.Worksheets("Lista").Cells(i, 5).Value = cuotas_emitidas
End If
Next e
End Sub
HTML:
<table>
<tbody>
<tr>
<td class="fondoOscuro">2.01.60 TOTAL APORTANTES</td>
<td>58</td>
</tr>
<tr>
<td class="fondoOscuro">2.01.70 CUOTAS EMITIDAS</td>
<td>20000000 </td>
</tr>
<tr>
<td class="fondoOscuro">2.01.71 CUOTAS PAGADAS</td>
<td>7691000</td>
</tr>
<tr>
<td class="fondoOscuro">2.01.72 CUOTAS SUSCRITAS Y NO PAGADAS</td>
<td>0 </td>
</tr>
<tr>
<td class="fondoOscuro">2.01.73 NUMERO DE CUOTAS CON PROMESA DE SUSCRIPCION Y PAGO</td>
<td>0 </td>
</tr>
<tr>
<td class="fondoOscuro">2.01.74 NUMERO DE CONTRATOS DE PROMESAS DE SUSCRIPCION Y PAGO</td>
<td>0</td>
</tr>
<tr>
<td class="fondoOscuro">2.01.75 NUMERO DE PROMITENTES SUSCRIPTORES DE CUOTAS</td>
<td>0 </td>
</tr>
<tr>
<td class="fondoOscuro">2.01.80 VALOR LIBRO DE LA CUOTA</td>
<td>1.0059 </td>
</tr>
</tbody></table>
'
XHR:
You can do the whole thing with XHR and scrape without opening a browser. Change the Activesheet output to the sheet you want to write the table to (WriteTable hTable, 1, ActiveSheet
).
Note the arguments for the POST body include:
- mm=12 # months
- aa=2017 year
- rut=9278 rut code
Code:
Public Sub GetTable()
Dim sResponse As String, hTable As Object, id As String, lista() As String, rut As String
Dim strBody As String
id = Worksheets("Lista").Cells(2, 1).Value
lista = Split(id, "-")
rut = lista(0)
strBody = "mm=12&aa=2017&rut=" & rut
With CreateObject("MSXML2.XMLHTTP")
.Open "POST", "http://www.cmfchile.cl//institucional/mercados/entidad.php?auth=&send=&mercado=V&rut=9278&grupo=&tipoentidad=FINRE&vig=VI&row=AAAw%20cAAhAABP4MAAz&control=svs&pestania=27", False
.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
.send strBody
sResponse = StrConv(.responseBody, vbUnicode)
End With
sResponse = Mid$(sResponse, InStr(1, sResponse, "<!DOCTYPE "))
With CreateObject("htmlFile")
.Write sResponse
Set hTable = .getElementsByTagName("table")(1)
End With
Application.ScreenUpdating = False
WriteTable hTable, 1, ActiveSheet
Application.ScreenUpdating = True
End Sub
Public Sub WriteTable(ByVal hTable As Object, Optional ByVal startRow As Long = 1, Optional ByVal ws As Worksheet)
If ws Is Nothing Then Set ws = ActiveSheet
Dim tSection As Object, tRow As Object, tCell As Object, tr As Object, td As Object, R As Long, C As Long, tBody As Object
R = startRow
With ws
Set tBody = hTable.getElementsByTagName("tbody")
For Each tSection In tBody 'HTMLTableSection
Set tRow = tSection.getElementsByTagName("tr") 'HTMLTableRow
For Each tr In tRow
R = R + 1
Set tCell = tr.getElementsByTagName("td")
C = 1
For Each td In tCell 'DispHTMLElementCollection
.Cells(R, C).Value = td.innerText 'HTMLTableCell
C = C + 1
Next td
Next tr
Next tSection
End With
End Sub
With browser (using WriteTable sub from above as well)
Option Explicit
Public Sub GetInfo()
Dim ie As New InternetExplorer, hTable As HTMLTable
Application.ScreenUpdating = False
With ie
.Visible = True
.navigate "http://www.cmfchile.cl/institucional/mercados/entidad.php?auth=&send=&mercado=V&rut=9278&grupo=&tipoentidad=FINRE&vig=VI&row=AAAw%20cAAhAABP4MAAz&control=svs&pestania=27"
While .Busy Or .readyState < 4: DoEvents: Wend
.document.getElementById("aa").Value = 2017
.document.forms("consulta").submit
Do
DoEvents
On Error Resume Next
Set hTable = .document.getElementsByTagName("table")(1)
On Error GoTo 0
Loop While hTable Is Nothing
WriteTable hTable, 1, ActiveSheet
End With
Application.ScreenUpdating = True
End Sub
Output:
References:
HTML Object library via VBE > Tools > References
Adjusting to your code outline but still using pestania = 27
Option Explicit
Public Sub GetInfo()
Dim ie As New InternetExplorer, hTable As HTMLTable, lista() As String, id As String, rut As String, enlace As String
Application.ScreenUpdating = False
id = Worksheets("Lista").Cells(2, 1).Value
lista = Split(id, "-")
rut = lista(0)
enlace = "http://www.cmfchile.cl/institucional/mercados/entidad.php?auth=&send=&mercado=V&rut=" & rut & "&grupo=&tipoentidad=FINRE&vig=VI&row=AAAw+cAAhAABP4MAAz&control=svs&pestania=27"
With ie
.Visible = True
.navigate enlace '"http://www.cmfchile.cl/institucional/mercados/entidad.php?auth=&send=&mercado=V&rut=9278&grupo=&tipoentidad=FINRE&vig=VI&row=AAAw%20cAAhAABP4MAAz&control=svs&pestania=27"
While .Busy Or .readyState < 4: DoEvents: Wend
.document.getElementById("aa").Value = 2017
.document.forms("consulta").submit
Do
DoEvents
On Error Resume Next
Set hTable = .document.getElementsByTagName("table")(1)
On Error GoTo 0
Loop While hTable Is Nothing
WriteTable hTable, 1, ActiveSheet
End With
Application.ScreenUpdating = True
End Sub
这篇关于VBA - 抓取没有 id 的 HTML 表格的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!