Importing specific web data to Excel using VBA


Problem description


I'm very much a beginner on the VBA coding scene (web scripting is more my thing), but I need to create an Excel-based program that will import data from an intranet web application into a spreadsheet. Here's the gist of what I'm looking to set up. In the spreadsheet the user will enter the following info: username, password, a list of customer account numbers and a date range. The user will then click a command button that will make the following happen:

  1. Open web based program, login (based on login/password typed into spreadsheet) and navigate to the account search screen.

  2. Enter first customer account number into search field and click the "search" button to navigate to the specific customer account.

  3. Navigate to the "search activity" screen, enter the date range and click the "search activity" button.

  4. Pull the data from a specific column of the activity table and import the data to the spreadsheet.

  5. If there are multiple pages of data there will be a "Next Results" button, there should be a loop to click the next results button (if it exists) and pull the same column of data from each page until the button no longer exists (no more data).

  6. Once there are no more pages of data (or if there is only one page) the macro will loop back and navigate to the account search screen and perform the same operations for each account in the list of accounts typed into the spreadsheet until there are no other accounts.

  7. Once completed (all data successfully imported to the spreadsheet) it should close the IE window.

It's a little complicated, and I realize Excel/VBA is definitely not the best solution for performing these functions, but unfortunately it's what I have to use in this instance. I've been able to piece together some VBA that does almost everything above. The problem I'm having is that looping through the activity pages and pulling the data just will not work (I get a wide range of errors that only confuse me more). Sometimes it will pull data from the first page, click the "Next Results" button, get to the next page and throw an error, or even get through two or three pages and then throw an error. It doesn't make a lot of sense, but the most common error is "permission denied". Also, this code currently only pulls the data from one account; I was hoping that once I got it working for one account it would be simple to wrap the entire code in a loop that goes down the list of account numbers and does the same for each until completed. I've been stuck on this for a number of weeks and I'm really ready to toss out the whole thing and start from scratch, so any help would be very much appreciated!

Below is the code I have so far...

Private Sub CommandButton1_Click()

    ' open IE, navigate to the desired page and loop until fully loaded
    Set IE = New InternetExplorerMedium
    my_url = "https://customerinfo/pages/login.jsp"
    my_url2 = "https://customerinfo/pages/searchCustomer.jsp"
    my_url3 = "https://customerinfo/pages/searchAccountActivity.jsp"

    With IE
        .Visible = True
        .navigate my_url
        Do Until Not .Busy And .readyState = 4
            DoEvents
        Loop
    End With

    ' Input the userid and password
    IE.document.getElementById("userId").Value = [B2]
    IE.document.getElementById("password").Value = [B3]

    ' Click the "Login" button
    IE.document.getElementById("action").Click
    Do Until Not IE.Busy And IE.readyState = 4
        DoEvents
    Loop

    ' Navigate to Search screen
    With IE
        .navigate my_url2
        Do Until Not .Busy And .readyState = 4
            DoEvents
        Loop
    End With

    ' Input the account number & click search
    IE.document.getElementById("accountNumber").Value = [B5]
    IE.document.getElementById("action").Click
    Do Until Not IE.Busy And IE.readyState = 4
        DoEvents
    Loop

    With IE
        .navigate my_url3
        Do Until Not .Busy And .readyState = 4
            DoEvents
        Loop
    End With

    'Input search criteria
    IE.document.getElementById("store").Value = [C7]
    IE.document.getElementById("dateFromMonth").Value = [C10]
    IE.document.getElementById("dateFromDay").Value = [B11]
    IE.document.getElementById("dateFromYear").Value = [B12]
    IE.document.getElementById("timeFromHour").Value = [B20]
    IE.document.getElementById("timeFromMinute").Value = [B21]
    IE.document.getElementById("dateToMonth").Value = [C15]
    IE.document.getElementById("dateToDay").Value = [B16]
    IE.document.getElementById("dateToYear").Value = [B17]
    IE.document.getElementById("timeToHour").Value = [B24]
    IE.document.getElementById("timeToMinute").Value = [B25]
    IE.document.getElementById("action").Click
    Do Until Not IE.Busy And IE.readyState = 4
        DoEvents
    Loop

    'Pulls data from activity search
    Dim TDelements As IHTMLElementCollection
    Dim TDelement As HTMLTableCell
    Dim r As Long, i As Long
    Dim e As Object

    Application.Wait Now + TimeValue("00:00:05")
    Set TDelements = IE.document.getElementsByTagName("tr")
    r = 0
    For i = 1 To 1
        Application.Wait Now + TimeValue("00:00:03")
        For Each TDelement In TDelements
            If TDelement.className = "searchActivityResultsOldContent" Then
                Sheet1.Range("E1").Offset(r, 0).Value = TDelement.ChildNodes(8).innerText
                r = r + 1
            ElseIf TDelement.className = "searchActivityResultsNewContent" Then
                Sheet1.Range("E1").Offset(r, 0).Value = TDelement.ChildNodes(8).innerText
                r = r + 1
            End If
        Next
        Application.Wait Now + TimeValue("00:00:02")
        Set elems = IE.document.getElementsByTagName("input")
        For Each e In elems
            If e.Value = "Next Results" Then
                e.Click
                i = 0
                Exit For
            End If
        Next e
    Next i

    Do Until Not IE.Busy And IE.readyState = 4
      DoEvents
    Loop
    IE.Quit

End Sub

Solution

So, what happens after you click the "Next..." element? Let me describe an issue I encountered. Assume the code flow is as follows:

  1. Create an IE instance and navigate to some URL, e.g. the first search results page.
  2. Check whether the page is loaded and ready. Wait for it.
  3. Create the DispHTMLElementCollection collection of the target elements, retrieved by .document.getElementsByTagName(), etc.
  4. Loop through the elements of the collection and do some stuff.
  5. Click on the "Next..." element. The issue is that in some cases the next page doesn't start downloading immediately after the click, due to some JS or XHR processing.
  6. Make the conventional check whether the next page is loaded and ready. This check lets the code continue without any delay: since the download of the next page has not started yet, the page that is still displayed is mistaken for the next page, already downloaded and ready. Simple delays of several seconds don't provide a reliable way to get the ready page.
  7. Again create the DispHTMLElementCollection collection, by mistake from the existing page instead of the next page.
  8. Loop through the elements of the created collection. While the loop is in progress, the next page starts downloading. The collection still contains references to the objects, but the page those objects belong to has been unloaded. Any attempt to access an element of the unloaded page, or the document object while it is unresponsive, then fails with a "permission denied" error.
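
Steps 5 to 8 suggest one possible guard for cases where the click cannot be avoided (for example, when the control is an <input> button with no URL to read). The sketch below is not from the original answer: it tags the current page's body with a marker attribute before clicking, then waits until a fully loaded document without that marker appears. The procedure name ClickAndWaitForNewPage, the data-stale attribute name and the 30-second default timeout are all assumptions, and the idea only helps if the click triggers a real page navigation rather than an in-place update of the same document.

```vba
Option Explicit

' Click a control, then block until a *different* document has finished loading.
' IE is an InternetExplorer automation object; target is an element on the current page.
Public Sub ClickAndWaitForNewPage(ByVal IE As Object, ByVal target As Object, _
                                  Optional ByVal timeoutSec As Double = 30)
    Dim deadline As Date
    Dim marker As Variant

    deadline = Now + timeoutSec / 86400#              ' 86400 seconds per day
    IE.document.body.setAttribute "data-stale", "1"   ' tag the page we are about to leave
    target.Click

    Do
        DoEvents
        If Now > deadline Then
            Err.Raise vbObjectError + 3, , "No new page appeared after the click"
        End If

        marker = "1"              ' treat the page as stale unless proven otherwise
        On Error Resume Next      ' the document may be inaccessible mid-navigation
        marker = IE.document.body.getAttribute("data-stale")
        On Error GoTo 0
        ' A freshly loaded page has no marker, so getAttribute yields Null.
    Loop While IE.Busy Or IE.readyState <> 4 Or Not IsNull(marker)
End Sub
```

Any element collection should be fetched again after this call returns, so that no reference to the unloaded page is reused.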

My suggestion is to avoid clicking on "Next..." altogether: read the next page URL from the .href property of the "Next..." anchor <a> element, invoke IE.navigate with that URL, and then check the page readiness.

Take a look at the example implementing that approach.
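
The linked example is not reproduced here, so the following is only a minimal sketch of the same idea, written against the class names from the question. It assumes the "Next..." control is an <a> element whose text starts with "Next" (in the question it is an <input> button, so this may not apply directly), and the names WaitForPage, NextPageUrl and PullAllPages are hypothetical. The key points are that the row collection is queried afresh on every page and that the next URL is read before leaving the current page.

```vba
Option Explicit

' Wait until IE reports the page as fully loaded; give up after timeoutSec seconds.
Private Sub WaitForPage(ByVal IE As Object, Optional ByVal timeoutSec As Double = 30)
    Dim deadline As Date
    deadline = Now + timeoutSec / 86400#
    Do While IE.Busy Or IE.readyState <> 4
        DoEvents
        If Now > deadline Then Err.Raise vbObjectError + 1, , "Page load timed out"
    Loop
End Sub

' Return the href of the first anchor whose text starts with "Next", or "" if none exists.
Private Function NextPageUrl(ByVal doc As Object) As String
    Dim a As Object
    For Each a In doc.getElementsByTagName("a")
        If Trim$(a.innerText) Like "Next*" Then
            NextPageUrl = a.href
            Exit Function
        End If
    Next a
End Function

' Copy one cell per result row into a column starting at firstCell, across all pages.
Public Sub PullAllPages(ByVal IE As Object, ByVal firstCell As Range)
    Dim tr As Object
    Dim url As String
    Dim r As Long

    Do
        WaitForPage IE
        ' Query the rows again on every page; never reuse a collection from an earlier page.
        For Each tr In IE.document.getElementsByTagName("tr")
            If tr.className = "searchActivityResultsOldContent" _
               Or tr.className = "searchActivityResultsNewContent" Then
                firstCell.Offset(r, 0).Value = tr.ChildNodes.Item(8).innerText
                r = r + 1
            End If
        Next tr

        url = NextPageUrl(IE.document)   ' read the URL while this page is still loaded
        If Len(url) = 0 Then Exit Do     ' no "Next" link: this was the last page
        IE.navigate url
    Loop
End Sub
```

With the question's code, the call would replace the whole "Pulls data from activity search" section, for example PullAllPages IE, Sheet1.Range("E1").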

IMO the most efficient way is to use XHR, like this, this and this.
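
As a rough illustration of the XHR route (a sketch, not the linked examples): the page is downloaded with MSXML2.XMLHTTP and parsed into an in-memory "htmlfile" document, so no browser window and no page-readiness polling are involved. The function names are hypothetical, and it is assumed that the results page can be fetched with a plain GET; how the intranet application handles login, session cookies and search parameters is unknown and would have to be worked out from its actual requests. The child-node index 8 is taken from the question and may differ once the markup is parsed this way.

```vba
Option Explicit

' Download a page and parse it into an in-memory HTML document (no browser window).
Public Function FetchDocument(ByVal url As String) As Object
    Dim http As Object
    Dim doc As Object

    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", url, False      ' False = synchronous: send returns once the response is in
    http.send
    If http.Status <> 200 Then
        Err.Raise vbObjectError + 2, , "HTTP " & http.Status & " for " & url
    End If

    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = http.responseText
    Set FetchDocument = doc
End Function

' Copy one cell of every result row into a column starting at firstCell.
' Returns the number of rows written.
Public Function CopyResultRows(ByVal doc As Object, ByVal firstCell As Range) As Long
    Dim tr As Object
    Dim r As Long

    For Each tr In doc.getElementsByTagName("tr")
        If tr.className = "searchActivityResultsOldContent" _
           Or tr.className = "searchActivityResultsNewContent" Then
            firstCell.Offset(r, 0).Value = tr.ChildNodes.Item(8).innerText
            r = r + 1
        End If
    Next tr
    CopyResultRows = r
End Function
```

Given a results URL in a variable such as resultsUrl, a call would look like n = CopyResultRows(FetchDocument(resultsUrl), Sheet1.Range("E1")). Because every document is fully downloaded before it is parsed, the stale-collection problem described above does not arise.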
