如果搜索页面网址时出错,请继续处理 [英] Continue processing if there is an error searching for page url

查看:60
本文介绍了如果搜索页面网址时出错,请继续处理的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有大约20,000名演员的清单,以检查其中有德文版的Wikipedia页面.

I have a list of approximately 20,000 actors to check which has a German Wikipedia page.

我找到了一个代码,您可以使用该代码通过Google搜索网址,并将第一个结果复制到Excel中.

I found a code with which you can search for urls via Google and get the first result copied into Excel.
Using VBA in Excel to Google Search in IE and return the hyperlink of the first result

我试图通过让Google仅搜索德语页面来将搜索限制为德语Wikipedia.例如." site:de.wikipedia.org intitle:johnny depp"

I tried to restrict the search to the German Wikipedia by having Google search for German pages only. E.g. "site:de.wikipedia.org intitle:johnny depp"

这适用于已知演员.

搜索没有自己页面的演员时出现错误.

I get an error when I search for an actor that does not have his own page.

错误91:对象变量或未设置块变量"

"Error 91: Object variable or with block variable not set"

如果演员没有自己的页面,而是跳过列表中的下一个页面,我如何构建一个跳过该演员的变通方法?

How can I build a work-around that skips the actor when he/she has no own page and instead continues with the next in the list?

或者也许您有一个更简单的解决方案.

Or maybe you have a simpler solution.

示例文件

Sub XMLHTTP()
    Dim url As String, lastRow As Long
    Dim XMLHTTP As Object, html As Object, objResultDiv As Object, objH3 As Object, link As Object
    Dim start_time As Date
    Dim end_time As Date

    lastRow = Range("A" & Rows.Count).End(xlUp).Row
    
    Dim cookie As String
    Dim result_cookie As String
    
    start_time = Time
    Debug.Print "start_time:" & start_time

    For i = 2 To lastRow

        url = "https://www.google.de/search?q=" & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)

        Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
        XMLHTTP.Open "GET", url, False
        XMLHTTP.setRequestHeader "Content-Type", "text/xml"
        XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
        XMLHTTP.send

        Set html = CreateObject("htmlfile")
        html.body.innerHTML = XMLHTTP.ResponseText
        Set objResultDiv = html.getelementbyid("rso")
        Set objH3 = objResultDiv.getelementsbytagname("H3")(0)
        Set link = objH3.getelementsbytagname("a")(0)

        str_text = Replace(link.innerHTML, "<EM>", "")
        str_text = Replace(str_text, "</EM>", "")

        Cells(i, 2) = str_text
        Cells(i, 3) = link.href
        DoEvents
    Next
    
    end_time = Time
    Debug.Print "end_time:" & end_time
    
    Debug.Print "done" & "Time taken : " & DateDiff("n", start_time, end_time)
    MsgBox "done" & "Time taken : " & DateDiff("n", start_time, end_time)
End Sub

推荐答案

检查是否找到 objResultDiv 元素,如果找到,请继续进行操作,否则将"Not Found"写入单元格.

Check if objResultDiv element is found and if it is found, proceed further else write "Not Found" to the cells.

您可以尝试这样的事情...

You may try something like this...

Sub XMLHTTP()
Dim url As String, lastRow As Long
Dim XMLHTTP As Object, html As Object, objResultDiv As Object, objH3 As Object, link As Object
Dim start_time As Date
Dim end_time As Date
Dim i As Long
Dim str_text As String

lastRow = Range("A" & Rows.Count).End(xlUp).Row

Dim cookie As String
Dim result_cookie As String

start_time = Time
Debug.Print "start_time:" & start_time

For i = 2 To lastRow

    url = "https://www.google.de/search?q=" & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)

    Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
    XMLHTTP.Open "GET", url, False
    XMLHTTP.setRequestHeader "Content-Type", "text/xml"
    XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
    XMLHTTP.send

    If XMLHTTP.Status = 200 Then
        Set html = CreateObject("htmlfile")
        html.body.innerHTML = XMLHTTP.ResponseText
        Set objResultDiv = html.getelementbyid("rso")

        If Not objResultDiv Is Nothing Then
            Set objH3 = objResultDiv.getelementsbytagname("H3")(0)
            Set link = objH3.getelementsbytagname("a")(0)


            str_text = Replace(link.innerHTML, "<EM>", "")
            str_text = Replace(str_text, "</EM>", "")

            Cells(i, 2) = str_text
            Cells(i, 3) = link.href
            DoEvents
        Else
            Cells(i, 2) = "Not Found"
            Cells(i, 3) = "Not Found"
        End If
    Else
        Cells(i, 2) = "Not Found"
        Cells(i, 3) = "Not Found"
    End If
Next

end_time = Time
Debug.Print "end_time:" & end_time

Debug.Print "done" & "Time taken : " & DateDiff("n", start_time, end_time)
MsgBox "done" & "Time taken : " & DateDiff("n", start_time, end_time)
End Sub

这篇关于如果搜索页面网址时出错,请继续处理的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆