Retrieving data from the web using VBA


Problem description


I've just started using HTML. I'm reasonably capable in VBA but am having some problems linking the two.

I have passed a registration number to a web site and am trying to get the results. Code used so far:

'Requires references to Microsoft Internet Controls and Microsoft HTML Object Library
Dim HTMLDoc As HTMLDocument
Dim MyBrowser As InternetExplorer

Sub GetVehicleDetails()

  Dim MyHTML_Element As IHTMLElement
  Dim MyURL As String
  Dim x As Integer
  On Error GoTo Err_Clear
  MyURL = "http://www.1stchoice.co.uk/find-a-part"
  x = 0
  Set MyBrowser = New InternetExplorer
  MyBrowser.Silent = True
  MyBrowser.navigate MyURL
  MyBrowser.Visible = True
  'wait for the page to finish loading, yielding so the host application stays responsive
  Do
    DoEvents
  Loop Until MyBrowser.readyState = READYSTATE_COMPLETE
  Set HTMLDoc = MyBrowser.document
  HTMLDoc.all.license_plate.Value = "LV11VYT"

  For Each MyHTML_Element In HTMLDoc.getElementsByTagName("button") '("input")
    'Get 2nd button
    If MyHTML_Element.Title = "Continue" Then 'MyHTML_Element.Click: Exit For
      x = x + 1
      If x = 2 Then
        MyHTML_Element.Click
      End If
    End If
  Next
Err_Clear:
  If Err <> 0 Then
    Err.Clear
    Resume Next
  End If
End Sub

Now I need to wait until the page is refreshed and then get the result, but I'm not certain how to pull the result out.

The source code is:

<div id="block_subheader" class="block_editable block_wysiwyg">
<p>Almost there! <strong>TELL US</strong>&nbsp;which parts you need - <strong>ADD&nbsp;</strong>your contact details &amp; receive <strong>No Obligation Quotes</strong><span style="font-weight: normal;">&nbsp;to compare &amp; </span><span style="font-weight: normal;"><strong>Save &pound;&pound;'s!</strong></span></p>                      
</div>
<div class="clear"></div>
<form id="step3" action="/find-a-part/step-3" method="post" enctype="multipart/form-data">
<div class="clearfix">
<h2>RENAULT MEGANE (X95) DYNAMIQUE TOMTOM DCI ECO 3 DOOR COUPE 1461cc (2011) DIESEL</h2>
<p><a href="/find-a-part/step-2">Not quite the vehicle you're searching for? Click here to specify the vehicle exactly</a></p>
</div>

I'm trying to get the Renault Megane details.

Can anyone help please?

OK, I have got past this part but have run into another problem: when the page changes after the button is clicked, I need to update the HTML document to the new page, because when I use it in the code it pulls up the old source code.

I can get it to work, but it only works when a message box pops up to say what the browser name is.

Any suggestions?

Dim HTMLDoc As HTMLDocument
Dim MyBrowser As InternetExplorer

Sub GetVehicleDetails2()

  Dim MyHTML_Element As IHTMLElement
  Dim HTMLDoc As HTMLDocument, Doc As HTMLDocument
  Dim MyURL As String, Vehicle As String
  Dim x As Integer, y As Integer
  On Error GoTo Err_Clear
  MyURL = "http://www.1stchoice.co.uk/find-a-part"
  x = 0
  'open new explorer
  Set MyBrowser = New InternetExplorer
  MyBrowser.Silent = True
  'navigate to page
  MyBrowser.navigate MyURL
  MyBrowser.Visible = True
  'wait until ready (readyState 4 = READYSTATE_COMPLETE)
  Do While MyBrowser.Busy Or MyBrowser.readyState <> 4
    DoEvents
  Loop
  Set HTMLDoc = MyBrowser.document

  'enter registration in text box
  HTMLDoc.all.license_plate.Value = "LV11VYT"

  'click continue button
  Set MyHTML_Element = HTMLDoc.getElementsByTagName("button")(1)
  MyHTML_Element.Click
  Set HTMLDoc = Nothing
  'wait until page updated

  Set Doc = MyBrowser.document
  'Application.Wait (Now() + "00:00:05")

  'does not work if you take this out
  MsgBox MyBrowser.FullName

  'find text returned with vehicle details
  For Each MyHTML_Element In Doc.getElementsByTagName("form")
    If MyHTML_Element.ID = "step3" Then
      Vehicle = MyHTML_Element.innerText
      MsgBox Vehicle
    End If
  Next
  'close browser down
  'MyBrowser.Quit

Err_Clear:
  If Err <> 0 Then
    Err.Clear
    Resume Next
  End If
End Sub

I'm using 2003 or 2007. I tried web queries, but they can't pass the value and use the Continue button.

Solution

I don't want to start an argument over extracting an element from HTML using Regex (vs. a parser), but Regex would be an easy way to extract the element you need, as it is well-defined and you only need that one element.

You could do something like the following. I also provide an alternative using just InStr, which works for your example, but if lots of results are returned at once or the syntax changes, Regex would be more flexible:

Sub blah()

    Dim testStr As String

    'test string you provided in the Question -> substitute it for your HTML return
    testStr = ActiveSheet.Cells(1, 1).Value

'Method 1: Use a simple InStr. Fine for the example you provided, but if the
'bits you need to search for are more complicated you may need Regex instead.

    Dim startLocation As Long, endLocation As Long
    Dim extractedText As String

    startLocation = InStr(1, testStr, "<h2>", vbTextCompare)

    If startLocation > 0 Then

        endLocation = InStr(startLocation, testStr, "</h2>", vbTextCompare)

        'only extract when the closing tag was found; otherwise Mid would get a negative length
        If endLocation > 0 Then
            extractedText = Mid(testStr, startLocation + 4, endLocation - startLocation - 4)
            Debug.Print "Basic InStr method: "; extractedText
        End If

    End If

'Method 2: Use Regex

    'more flexible -> reference a Regex engine.
    'This example uses Microsoft VBScript Regular Expressions 5.5
    'That engine uses the same syntax as MS JavaScript regex
    'See http://msdn.microsoft.com/en-us/library/1400241x.aspx for syntax

    Dim regex As RegExp
    Dim match As Match

    Set regex = New RegExp

    With regex

        .Pattern = "(?:<h2>)([\s\S]*?)(?=</h2>)"
        'NB this regex engine does not support lookbehinds :-(
        'so we have to extract the submatched group for what we want
        '(vs. just using Match.Value)
        .IgnoreCase = True
        .MultiLine = True
        .Global = True 'return every <h2> in the input, not only the first

        For Each match In .Execute(testStr)
            Debug.Print "Regex match: "; match.SubMatches.Item(0)
        Next match

    End With

End Sub

Output is:

Basic InStr method: RENAULT MEGANE (X95) DYNAMIQUE TOMTOM DCI ECO 3 DOOR COUPE 1461cc (2011) DIESEL
Regex match: RENAULT MEGANE (X95) DYNAMIQUE TOMTOM DCI ECO 3 DOOR COUPE 1461cc (2011) DIESEL
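
If you would rather not add the Regular Expressions reference to the project, the same extraction can be sketched with late binding. This is only an illustration: `ExtractFirstH2` is a hypothetical helper name, not something from the code above, and it uses a plain capture group instead of the lookahead.

```vba
'Hypothetical helper: returns the text between the first <h2> and </h2>
'in html, trimmed of surrounding spaces, or "" when no such pair exists.
Function ExtractFirstH2(ByVal html As String) As String
    Dim regex As Object
    Dim matches As Object

    'Late binding: no project reference to the regex library is needed
    Set regex = CreateObject("VBScript.RegExp")
    regex.Pattern = "<h2>([\s\S]*?)</h2>"
    regex.IgnoreCase = True
    regex.Global = False 'first match only

    Set matches = regex.Execute(html)
    If matches.Count > 0 Then
        ExtractFirstH2 = Trim$(matches(0).SubMatches(0))
    Else
        ExtractFirstH2 = ""
    End If
End Function
```

For example, `Debug.Print ExtractFirstH2(testStr)` could replace the loop when only the first heading matters.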

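On the second problem in the question (the old source coming back unless a message box is shown): a likely explanation is that the click starts a navigation that has not finished when `MyBrowser.document` is read, and the message box simply gives the page time to load. One possible workaround is to poll until an element that only exists on the new page appears. The sketch below assumes the results page is the only one containing an element with id `step3`; `WaitForElement` and its parameters are hypothetical names, not part of any library.

```vba
'Hypothetical helper: polls the browser's document until an element with the
'given id exists, or until timeoutSeconds have passed.
'Returns the element, or Nothing on timeout.
Function WaitForElement(ByVal browser As Object, ByVal elementId As String, _
                        Optional ByVal timeoutSeconds As Long = 30) As Object
    Dim deadline As Date
    Dim found As Object

    deadline = DateAdd("s", timeoutSeconds, Now)

    Do While Now < deadline
        DoEvents 'let the browser and the host application process events

        Set found = Nothing
        On Error Resume Next 'the document may be unavailable mid-navigation
        Set found = browser.document.getElementById(elementId)
        On Error GoTo 0

        If Not found Is Nothing Then
            Set WaitForElement = found
            Exit Function
        End If
    Loop
End Function

'Usage sketch: call this after MyHTML_Element.Click instead of the MsgBox
Sub ShowVehicleHeading(ByVal browser As Object)
    Dim resultForm As Object
    Dim headings As Object

    Set resultForm = WaitForElement(browser, "step3")
    If resultForm Is Nothing Then Exit Sub 'timed out

    Set headings = resultForm.getElementsByTagName("h2")
    If headings.Length > 0 Then MsgBox headings(0).innerText
End Sub
```

Reading the `<h2>` through the DOM like this is the parser-style alternative to the string methods above; which one is preferable depends on how stable the page markup is.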