VBA宏可从IE中的链接下载多个文件 [英] VBA Macro to download multiple files from links in IE

查看:134
本文介绍了VBA宏可从IE中的链接下载多个文件的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想从链接列表中下载多个文件.我找到链接的网站受到保护.这就是为什么我要使用IE(使用当前会话/cookie)的原因.每个链接的目标是一个xml文件.文件太大,无法打开然后保存.因此,我需要直接保存它们(单击鼠标右键,将目标另存为).

I want to download multiple files from a list of links. The website where I find the links is protected. This is why I want to use IE (using the current session/cookie). The target of each link is a xml file. The files are too large to open and then save. So I need to save them directly (right-click, save target as).

链接列表如下:

<html>
<body>
<p> <a href="https://example.com/report?_hhhh=XML"Link A</a><br>> </p>
<p> <a href="https://example.com/report?_aaaa=XML"Link B</a><br>> </p>
...
</body>
</html>

我想遍历所有链接并保存每个目标.目前,另存为"存在问题.我真的不知道该怎么做.到目前为止,这是我的代码:

I want to loop through all links and save each target. Currently I have problems with the "Save As". I don't really know how to do it. This is my code so far:

Sub DownloadAllLinks()

Dim IE As Object
Dim Document As Object
Dim List As Object
Dim Link As Object

' Before I logged in to the website
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
IE.Navigate ("https:\\......\links.html")

Do While IE.Busy
  DoEvents
Loop

' Detect all links on website
Set Document = IE.Document
Set List = Document.getElementsByTagName("a")

' Loop through all links to download them

For Each Link In List

' Now I need to automate "save target as" / right-click and then "save as"
...

Next Link
End Sub

您是否有任何想法可以自动为每个链接另存为"?

Do you have any ideas to automate "Save As" for each link?

感谢您的帮助.非常感谢, 乌里

Any help is appreciated. Many thanks, Uli

推荐答案

下面是我针对您的情况改编的一个非常常见的示例,它显示了XHR和RegEx用来检索网页HTML内容,从中提取所有链接以及下载每个链接的目标文件:

Below is a quite common example I adapted for your case, it shows the usage of XHR and RegEx to retrieve webpage HTML content, extract all links from it, and download each link's target file:

Option Explicit

Sub Test()
    ' declare vars
    Dim sUrl As String
    Dim sReqProt As String
    Dim sReqAddr As String
    Dim sReqPath As String
    Dim sContent As String
    Dim oLinks As Object
    Dim oMatch As Object
    Dim sHref As String
    Dim sHrefProt As String
    Dim sHrefAddr As String
    Dim sHrefPath As String
    Dim sHrefFull As String
    Dim n As Long
    Dim aContent() As Byte
    ' set source URL
    sUrl = "https:\\......\links.html"
    ' process source URL
    SplitUrl sUrl, sReqProt, sReqAddr, sReqPath
    If sReqProt = "" Then sReqProt = "http:"
    sUrl = sReqProt & "//" & sReqAddr & "/" & sReqPath
    ' retrieve source page HTML content
    With CreateObject("Microsoft.XMLHTTP")
        .Open "GET", sUrl, False
        .Send
        sContent = .ResponseText
    End With
    ' parse source page HTML content to extract all links
    Set oLinks = CreateObject("Scripting.Dictionary")
    With CreateObject("VBScript.RegExp")
        .Global = True
        .MultiLine = True
        .IgnoreCase = True
        .Pattern = "<a.*?href *= *(?:'|"")(.*?)(?:'|"").*?>"
        For Each oMatch In .Execute(sContent)
            sHref = oMatch.subMatches(0)
            SplitUrl sHref, sHrefProt, sHrefAddr, sHrefPath
            If sHrefProt = "" Then sHrefProt = sReqProt
            If sHrefAddr = "" Then sHrefAddr = sReqAddr
            sHrefFull = sHrefProt & "//" & sHrefAddr & "/" & sHrefPath
            oLinks(oLinks.Count) = sHrefFull
        Next
    End With
    ' save each link target into file
    For Each n In oLinks
        sHref = oLinks(n)
        With CreateObject("Microsoft.XMLHTTP")
            .Open "GET", sHref, False
            .Send
            aContent = .ResponseBody
        End With
        With CreateObject("ADODB.Stream")
            .Type = 1 ' adTypeBinary
            .Open
            .Write aContent
            .SaveToFile "C:\Test\" & n & ".xml", 2 ' adSaveCreateOverWrite
            .Close
        End With
    Next
End Sub

Sub SplitUrl(sUrl, sProt, sAddr, sPath)
    ' extract protocol, address and path from URL
    Dim aSplit
    aSplit = Split(sUrl, "//")
    If UBound(aSplit) = 0 Then
        sProt = ""
        sAddr = sUrl
    Else
        sProt = aSplit(0)
        sAddr = aSplit(1)
    End If
    aSplit = Split(sAddr, "/")
    If UBound(aSplit) = 0 Then
        sPath = sAddr
        sAddr = ""
    Else
        sPath = Mid(sAddr, Len(aSplit(0)) + 2)
        sAddr = aSplit(0)
    End If
End Sub

此方法未使用IE自动化.通常,Microsoft.XMLHTTP处理的IE cookie足以引用当前会话,因此,如果您的网站不使用其他过程进行身份验证和生成链接列表,则该方法将为您工作.

This method doesn't employ IE automation. Usually the IE's cookies which Microsoft.XMLHTTP processes are sufficient to refer to the current session, so if your website doesn't use additional procedures for authentication and generation the list of the links then the method should work for you.

这篇关于VBA宏可从IE中的链接下载多个文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆