Getting data from HTML source in VBA (excel)


Question

I'm trying to collect data from a website, which should be manageable once the source is in string form. Looking around, I've assembled some possible solutions but have run into problems with all of them:


  1. Using InternetExplorer.Application to open the URL and then accessing the inner HTML

  2. Inet

  3. Using a Shell command to run wget

Here are the problems I'm having:


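  1. When storing the innerHTML in a string, I don't get the entire source, only part of it

  2. ActiveX doesn't allow the Inet object to be created (error 429)

  3. I've got the htm file into a folder on my computer, but how do I get it into a string in VBA?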

Code for 1:

    Sub getData()
        Dim url As String, ie As Object, state As Integer
        ' Long rather than Integer: InStr positions can exceed 32767 on large pages
        Dim text As Variant, startS As Long, endS As Long

        Set ie = CreateObject("InternetExplorer.Application")
        ie.Visible = False

        url = "http://www.eoddata.com/stockquote/NASDAQ/AAPL.htm"
        ie.Navigate url

        ' Wait until the page has finished loading (4 = READYSTATE_COMPLETE)
        state = 0
        Do Until state = 4
            DoEvents
            state = ie.readyState
        Loop

        text = ie.Document.Body.innerHTML
        startS = InStr(text, "7/26/2012")
        endS = InStr(text, "7/25/2012")

        text = Mid(text, startS, endS - startS)

        MsgBox text
    End Sub


Recommended answer

If I were trying to pull the opening price for 08/10/12 off that page, which is similar to what I assume you are doing, I'd do something like this:

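    ' Early binding: requires a reference to "Microsoft Internet Controls"
    Dim ie As InternetExplorer
    Dim objHTML As Object, elementONE As Object
    Dim elementTWO As String, i As Long

    Set ie = New InternetExplorer
    With ie
        .navigate "http://eoddata.com/stockquote/NASDAQ/AAPL.htm"
        .Visible = False
        ' Wait for the page to finish loading
        While .Busy Or .readyState <> READYSTATE_COMPLETE
           DoEvents
        Wend
        Set objHTML = .document
        DoEvents
    End With
    Set elementONE = objHTML.getElementsByTagName("TD")
    ' The collection is zero-based; stop one short so Item(i + 1) stays in range
    For i = 0 To elementONE.Length - 2
        elementTWO = elementONE.Item(i).innerText
        If elementTWO = "08/10/12" Then
            MsgBox elementONE.Item(i + 1).innerText
            Exit For
        End If
    Next i
    DoEvents
    ie.Quit
    DoEvents
    Set ie = Nothing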

You can modify this to run through the HTML and pull whatever data you want. Using Item(i + 2) instead would return the high price, and so on.
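As a rough illustration of reading more than one cell from the matched row, here is a minimal sketch. `PrintOpenAndHigh` is a hypothetical helper, not part of the original answer; it assumes `cells` is the collection returned by `getElementsByTagName("TD")` and that the open and high prices sit in the two cells right after the date, as described above.

```vba
' Hypothetical helper: prints the two cells that follow the date cell.
' Assumes the column order on the page is date, open, high.
Sub PrintOpenAndHigh(ByVal cells As Object, ByVal targetDate As String)
    Dim i As Long
    ' Zero-based collection; stop early so Item(i + 2) stays in range
    For i = 0 To cells.Length - 3
        If Trim$(cells.Item(i).innerText) = targetDate Then
            Debug.Print "Open: " & cells.Item(i + 1).innerText
            Debug.Print "High: " & cells.Item(i + 2).innerText
            Exit Sub
        End If
    Next i
    Debug.Print targetDate & " not found"
End Sub
```

It could be called as `PrintOpenAndHigh objHTML.getElementsByTagName("TD"), "08/10/12"` once the page has loaded; the output goes to the Immediate window.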

Since there are a lot of dates on that page, you might also want to check that the match falls between the Recent End of Day Prices section and the Company profile section.
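One way to narrow the search to that region is to cut the page text down to the part between the two headings before looking for the date. The sketch below uses only `InStr` and `Mid$`; `SliceBetween` is a hypothetical helper, and the exact heading text on the page is an assumption taken from the wording above.

```vba
' Hypothetical helper: returns the part of source between startMarker and
' endMarker, or an empty string if either marker is missing.
Function SliceBetween(ByVal source As String, ByVal startMarker As String, _
                      ByVal endMarker As String) As String
    Dim startPos As Long, endPos As Long

    ' Case-insensitive search for the first marker
    startPos = InStr(1, source, startMarker, vbTextCompare)
    If startPos = 0 Then Exit Function
    startPos = startPos + Len(startMarker)

    ' Look for the second marker only after the first one
    endPos = InStr(startPos, source, endMarker, vbTextCompare)
    If endPos = 0 Then Exit Function

    SliceBetween = Mid$(source, startPos, endPos - startPos)
End Function
```

For example, `SliceBetween(objHTML.body.innerText, "Recent End of Day Prices", "Company profile")` would return only the text of the price table, assuming both headings appear on the page with that wording.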
