使用XMLHTTP方法时,请等待页面加载 [英] Wait until page is loaded when using XMLHTTP approach

查看:84
本文介绍了使用XMLHTTP方法时,请等待页面加载的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在以下工作代码中,我尝试导航到特定的YouTube频道要将视频名称导入excel ..它正在工作,但部分是因为代码仅列出了大约30个视频

 昏暗x,html作为对象,ele作为对象,sKeyWords作为字符串,i长使用CreateObject("MSXML2.ServerXMLHTTP").打开"GET","YouTube频道网址视频","False".setRequestHeader"Content-Type","application/x-www-form-urlencoded".发送如果.Status<>200然后MsgBox问题"和vbNewLine&状态和-"&.statusText:退出子设置html = CreateObject("htmlfile")html.body.innerHTML = .responseText 

我如何才能使代码加载页面..的所有内容?以便获取此处列出的所有视频.

我找到了一个网站,该网站列出了一个表中的所有视频,但是对于抓取该表的部分,我无法提取视频名称甚至无法处理该表这是我的尝试

  Sub Post_Method()昏暗的HTTP作为新的XMLHTTP60变暗html作为新的HTMLDocument昏暗的htmla作为对象暗淡作为对象作为对象的暗淡Dim strArg作为字符串昏暗的c只要昏暗x长strArg ="inputType = 1& stringInput = https%3A%2F%2Fwww.youtube.com%2Fchannel%2FUC43lrLHl4EhxaKQn2HrcJPQ& limit = 100& keyType = default& customKey ="用http.打开"POST","https://youtube-playlist-analyzer.appspot.com/submit",False.setRequestHeader内容类型",应用程序/x-www-form-urlencoded".send strArghtml.body.innerHTML = .responseText'WriteTxtFile html.body.innerHTML结束于昏暗的帖子作为对象,元素作为对象,r尽可能长``这部分我无法调整'------------------------设置帖子= html.getElementById("container").getElementById("tableContainer").getElementById("tableData")对于帖子中的每个元素对于elem.Cells中的每个拖曳c = c + 1:单元格(r + 1,c)= trow.innerText下一拖c = 0:r = r + 1下一个元素'----------------------------------停止结束子 

解决方案

您可以使用该端点,然后从包含感兴趣数据的响应中提取javascript对象,并使用jsonconverter.bas进行解析.

Json库:

我使用jsonconverter.bas.从


Py:

python更加简洁

 导入请求,re,json,csv数据= {'inputType':'1','stringInput':'https://www.youtube.com/channel/UC43lrLHl4EhxaKQn2HrcJPQ','限制':'100','keyType':'默认'}r = request.post('https://youtube-playlist-analyzer.appspot.com/submit',data=data)p = re.compile(r'json_items =(.*?);',re.DOTALL)结果= json.loads(p.findall(r.text)[0])使用open("data.csv","w",encoding ="utf-8-sig",newline ='')作为csv_file:w = csv.writer(csv_file,delimiter =,",quoting = csv.QUOTE_MINIMAL)#针对语言环境更改此设置w.writerow(['Title','ViewCount'])对于结果中的项目:w.writerow([[item ['title'],item ['viewCount']]) 

In the following working code, I am trying to navigate to specific youtube channel To get the videos names into excel .. It is working but partially as the code just lists about 30 videos only

    Dim x, html As Object, ele As Object, sKeyWords As String, i As Long

With CreateObject("MSXML2.ServerXMLHTTP")
    .Open "GET", "youtube channel url videos", False
    .setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    .send

    If .Status <> 200 Then MsgBox "Problem" & vbNewLine & .Status & " - " & .statusText: Exit Sub

    Set html = CreateObject("htmlfile")
    html.body.innerHTML = .responseText

How can I maje the code load all the content of the page ..? so as to get all the videos that are listed there.

I have found a site that lists all the videos in one table but as for the part of scraping the table, I failed to extract the video name or even dealing with the table Here's my try

Sub Post_Method()
Dim http        As New XMLHTTP60
Dim html        As New HTMLDocument
Dim htmla       As Object
Dim trow        As Object
Dim tcel        As Object
Dim strArg      As String
Dim c           As Long
Dim x           As Long

strArg = "inputType=1&stringInput=https%3A%2F%2Fwww.youtube.com%2Fchannel%2FUC43lrLHl4EhxaKQn2HrcJPQ&limit=100&keyType=default&customKey="

With http
    .Open "POST", "https://youtube-playlist-analyzer.appspot.com/submit", False
    .setRequestHeader "Content-type", "application/x-www-form-urlencoded"
    .send strArg
    html.body.innerHTML = .responseText
   ' WriteTxtFile html.body.innerHTML
End With

Dim posts As Object, elem As Object, r As Long

'This part I can't adjust
'------------------------
    Set posts = html.getElementById("container").getElementById("tableContainer").getElementById("tableData")

    For Each elem In posts.Children
        For Each trow In elem.Cells
            c = c + 1: Cells(r + 1, c) = trow.innerText
        Next trow
        c = 0: r = r + 1
    Next elem
    '----------------------------------
 Stop

End Sub

解决方案

You can use that endpoint then extract the javascript object from the response which contains data of interest and parse with jsonconverter.bas.

Json library:

I use jsonconverter.bas. Download raw code from here and add to standard module called JsonConverter . You then need to go VBE > Tools > References > Add reference to Microsoft Scripting Runtime. Remove the top Attribute line from the copied code.


VBA:

Option Explicit

Public Sub GetYouTubeViews()

    Dim s As String, ws As Worksheet, body As String

    body = "inputType=1&stringInput=https://www.youtube.com/channel/UC43lrLHl4EhxaKQn2HrcJPQ&limit=100&keyType=default"

    Set ws = ThisWorkbook.Worksheets("Sheet1")

    With CreateObject("MSXML2.XMLHTTP")
        .Open "POST", "https://youtube-playlist-analyzer.appspot.com/submit", False
        .setRequestHeader "User-Agent", "Mozilla/5.0"
        .send body
        s = .responseText
    End With

    Dim results(), r As Long, jsonSource As String
    Dim json As Object, item As Object, headers()

    jsonSource = GetString(s, "json_items = ", ";")

    If jsonSource = "No match" Then Exit Sub

    Set json = JsonConverter.ParseJson(jsonSource)

    headers = Array("Title", "ViewCount")

    ReDim results(1 To json.Count, 1 To UBound(headers) + 1)

    For Each item In json
        r = r + 1
        results(r, 1) = item("title")
        results(r, 2) = item("viewCount")
    Next

    With ws
        .Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
        .Cells(2, 1).Resize(UBound(results, 1), UBound(results, 2)) = results
    End With
End Sub

Public Function GetString(ByVal inputString As String, ByVal startPhrase As String, ByVal endPhrase As String) As String
    Dim s As Long, e As Long

    s = InStr(inputString, startPhrase)
    If Not s > 0 Then
        GetString = "No match"
        Exit Function
    End If

    e = InStr(s + Len(startPhrase) - 1, inputString, endPhrase)

    If Not e > 0 Then
        GetString = "No match"
        Exit Function
    End If
    GetString = Mid$(inputString, s + Len(startPhrase), e - (s + Len(startPhrase)))
End Function


Sample results:


Py:

A lot more concise with python

import requests, re, json ,csv

data = {
  'inputType': '1',
  'stringInput': 'https://www.youtube.com/channel/UC43lrLHl4EhxaKQn2HrcJPQ',
  'limit': '100',
  'keyType': 'default'
}

r = requests.post('https://youtube-playlist-analyzer.appspot.com/submit',  data=data)
p = re.compile(r'json_items = (.*?);', re.DOTALL)
results = json.loads(p.findall(r.text)[0])

with open("data.csv", "w", encoding="utf-8-sig", newline='') as csv_file:
    w = csv.writer(csv_file, delimiter = ",", quoting=csv.QUOTE_MINIMAL) #change this for locale
    w.writerow(['Title','ViewCount'])
    for item in results:
        w.writerow([item['title'], item['viewCount']])

这篇关于使用XMLHTTP方法时,请等待页面加载的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆