VBA HTML Tag Hierarchy


Problem description

A simple question: I am trying to write a procedure to parse the HTML of this site.

A part of the source code (lines 154 to 174) that is sufficient as an example is:

<p>(British Aircraft Company)</p>
<ul>
<li><a href="/wiki/B.A.C._I" title="B.A.C. I" class="mw-redirect">B.A.C. I</a></li>
<li><a href="/wiki/B.A.C._II" title="B.A.C. II" class="mw-redirect">B.A.C. II</a></li>
<li><a href="/wiki/B.A.C._III" title="B.A.C. III" class="mw-redirect">B.A.C. III</a></li>
<li><a href="/wiki/B.A.C._IV" title="B.A.C. IV" class="mw-redirect">B.A.C. IV</a></li>
<li><a href="/wiki/B.A.C._V" title="B.A.C. V" class="mw-redirect">B.A.C. V</a></li>
<li><a href="/wiki/B.A.C._VI" title="B.A.C. VI" class="mw-redirect">B.A.C. VI</a></li>
<li><a href="/wiki/B.A.C._VII" title="B.A.C. VII" class="mw-redirect">B.A.C. VII</a></li>
<li><a href="/wiki/B.A.C._VII_Mk.2" title="B.A.C. VII Mk.2" class="mw-redirect">B.A.C. VII Mk.2</a></li>
<li><a href="/wiki/B.A.C._VII_Planette" title="B.A.C. VII Planette" class="mw-redirect">B.A.C. VII Planette</a></li>
<li><a href="/wiki/B.A.C._VIII" title="B.A.C. VIII" class="mw-redirect">B.A.C. VIII</a></li>
<li><a href="/wiki/B.A.C._VIII_Bat-Boat" title="B.A.C. VIII Bat-Boat" class="mw-redirect">B.A.C. VIII Bat-Boat</a></li>
<li><a href="/wiki/B.A.C._IX" title="B.A.C. IX" class="mw-redirect">B.A.C. IX</a></li>
<li><a href="/wiki/B.A.C._Cupid" title="B.A.C. Cupid" class="mw-redirect">B.A.C. Cupid</a></li>
<li><a href="/wiki/B.A.C._Drone" title="B.A.C. Drone" class="mw-redirect">B.A.C. Drone</a></li>
<li><a href="/wiki/B.A.C._Super_Drone" title="B.A.C. Super Drone" class="mw-redirect">B.A.C. Super Drone</a></li>
<li><a href="/wiki/B.A._Swallow_2" title="B.A. Swallow 2" class="mw-redirect">B.A. Swallow 2</a></li>
<li><a href="/wiki/B.A._Eagle_2" title="B.A. Eagle 2" class="mw-redirect">B.A. Eagle 2</a></li>
<li><a href="/wiki/B.A._Double_Eagle" title="B.A. Double Eagle" class="mw-redirect">B.A. Double Eagle</a></li>
</ul>

I am trying to work something out. I can get to the <p> HTML tag, but I cannot reach the list items to loop over what I want, because they are further enclosed between the <ul></ul> tags. What would be your next steps?

Sub ICE()

Set Results = IE.document.getElementsByTagName("p")

For Each itm In Results
    If itm.innerHTML = "(British Aircraft Company)" Then




    End If
Next itm

End Sub

For a clearer picture: this stage of my study is based on the answer by ron at "VBA parsing of href".

Recommendation by user Doug Glancy:

--> It might be helpful to mention the desired results.

What I want is the ability to make VBA 'click' the href of my choice at runtime, since it is an actual link. I am studying ron's code for that, which is (as can be seen in the previous example):

If itm.outerhtml = "B.A.C. VII" Then
    itm.Click

    Do Until Not IE.Busy And IE.readyState = 4
        DoEvents
    Loop
    Exit For
End If

...here outerHTML is used; however, the core of my effort is the loop and the logical operator.


I wrote this piece of code, but it does not work:

Set Results = IE.document.getElementsByTagName("p")

For Each itm In Results
    If itm.innerHTML = "(British Aircraft Company)" Then
        Set Results2 = IE.document.getElementsByTagName("ul")
        For Each itm2 In Results2
            If itm2.innerHTML = "B.A.C. V" Then
                MsgBox itm2.innerHTML
            End If

        Next itm2
    End If
Next itm

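Solution

This will list the aircraft in the <ul> that immediately follows the <p> tag containing "(British Aircraft Company)". The code uses early-bound MSXML2 and MSHTML types, so the VBA project needs references to a Microsoft XML library and to the Microsoft HTML Object Library.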

Sub GetAircraft()

    Dim xHttp As MSXML2.XMLHTTP
    Dim hDoc As MSHTML.HTMLDocument
    Dim hUls As MSHTML.IHTMLElementCollection
    Dim hUl As MSHTML.HTMLListElement
    Dim hLi As MSHTML.HTMLLIElement

    Set xHttp = New MSXML2.XMLHTTP
    xHttp.Open "GET", "http://en.wikipedia.org/wiki/List_of_aircraft_%28B%29"
    xHttp.send

    Do
        DoEvents
    Loop Until xHttp.readyState = 4

    Set hDoc = New HTMLDocument
    hDoc.body.innerHTML = xHttp.responseText
    Set hUls = hDoc.getElementsByTagName("ul")

    'Go through all the <ul> tags
    For Each hUl In hUls
        'Only if previous tag is something
        If Not hUl.PreviousSibling Is Nothing Then
            'Only if previous tag is <p>
            If TypeName(hUl.PreviousSibling) = "HTMLParaElement" Then
                'Only if previous paragraph is specified text
                If hUl.PreviousSibling.innerText = "(British Aircraft Company)" Then
                    'loop through the <li> and print them out
                    For Each hLi In hUl.Children
                        Debug.Print hLi.innerText
                    Next hLi
                End If
            End If
        End If
    Next hUl

End Sub
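
The procedure above only prints the names, whereas the question also asks how to click one of the links at runtime. The following is a minimal sketch of how the same sibling-based idea might be applied to a live Internet Explorer document. It assumes IE is an InternetExplorer.Application object that has already navigated to the page and finished loading. The names ClickAircraftLink and linkText are hypothetical and do not come from the original answer, and late binding (As Object) is used so that no extra references are needed.

```vba
' Sketch only: ClickAircraftLink and linkText are illustrative names.
' Assumes IE is an InternetExplorer.Application object with the page loaded.
Sub ClickAircraftLink(ByVal IE As Object, ByVal linkText As String)

    Dim p As Object
    Dim ul As Object
    Dim a As Object

    For Each p In IE.document.getElementsByTagName("p")
        If Trim$(p.innerText) = "(British Aircraft Company)" Then

            'The next sibling may be a whitespace text node,
            'so walk forward until an element node is found
            Set ul = p.nextSibling
            Do While Not ul Is Nothing
                If ul.nodeType = 1 Then Exit Do   '1 = element node
                Set ul = ul.nextSibling
            Loop

            'Give up if the paragraph is not followed by a <ul>
            If ul Is Nothing Then Exit Sub
            If UCase$(ul.tagName) <> "UL" Then Exit Sub

            'Search only the links inside that one list
            For Each a In ul.getElementsByTagName("a")
                If a.innerText = linkText Then
                    a.Click

                    'Wait for the navigation to finish
                    Do While IE.Busy Or IE.readyState <> 4
                        DoEvents
                    Loop
                    Exit Sub
                End If
            Next a

            Exit Sub
        End If
    Next p

End Sub
```

It could be called as ClickAircraftLink IE, "B.A.C. VII". The sketch compares innerText rather than innerHTML or outerHTML, because the markup of an <a> or <ul> element includes its tags and attributes and so would not equal the visible link text.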
