Scraping data from a currently opened website in Excel and deleting the last search result (Excel VBA)


Question

I am trying to scrape company data from the web page www.bizi.si. It works, but when I change the company on the BIZI site, Excel still shows the result of the previous search (REPROMAT d.o.o. and its address) instead of the new one (CERJAK d.o.o. and its address). I have to close and reopen Excel to extract data for a different company. I would like to scrape data for different companies without closing the Excel file. Thank you.

Sub CompanyData()

Dim html As HTMLDocument, ws As Worksheet, nodes As Object

Set ws = ThisWorkbook.Worksheets("NAROČILO")
Set html = New HTMLDocument

With CreateObject("MSXML2.XMLHTTP")
    .Open "GET", "https://www.bizi.si/iskanje?q=" & Application.EncodeURL(ws.Range("A1").Value), False
    .send
    html.body.innerHTML = .responseText

    Set nodes = html.querySelectorAll("td.item")

    With ws
        .Range("A4").Value = nodes.Item(0).FirstChild.innerText
        .Range("A5").Value = nodes.Item(1).innerText
        .Range("B6").Value = nodes.Item(3).innerText
    End With

    .Open "GET", html.querySelector("[id$=linkCompany]").href, False
    .send
    html.body.innerHTML = .responseText
    ws.Range("A3") = html.querySelector("#ctl00_ctl00_cphMain_cphMainCol_CompanySPLPreview1_labTitlePRS").innerText
End With

End Sub

Recommended answer

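The problem was a missing line: .setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT". Without it, MSXML2.XMLHTTP was most likely returning a cached copy of an earlier response instead of requesting the page again, which would explain why the previous company's data kept appearing until Excel was restarted.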

Sub CompanyData()

Dim html As HTMLDocument, ws As Worksheet, nodes As Object

Set ws = ThisWorkbook.Worksheets("NAROČILO")
Set html = New HTMLDocument

With CreateObject("MSXML2.XMLHTTP")
    .Open "GET", "https://www.bizi.si/iskanje?q=" & Application.EncodeURL(ws.Range("A1").Value), False
    .setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
    .send
    html.body.innerHTML = .responseText

    Set nodes = html.querySelectorAll("td.item")

    With ws
        .Range("A4").Value = nodes.Item(0).FirstChild.innerText
        .Range("A5").Value = nodes.Item(1).innerText
        .Range("B6").Value = nodes.Item(3).innerText
    End With

    .Open "GET", html.querySelector("[id$=linkCompany]").href, False
    .send
    html.body.innerHTML = .responseText
    ws.Range("A3") = html.querySelector("#ctl00_ctl00_cphMain_cphMainCol_CompanySPLPreview1_labTitlePRS").innerText
End With

End Sub
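
The header fix can also be wrapped in a small helper so that every request in the macro gets it, including the second one that follows the company link. This is only a sketch: FetchFresh is a hypothetical name that does not appear in the original answer, and the status check is an added assumption about how failures should be reported.

```vba
' Hypothetical helper (not part of the original answer).
' Fetches a URL synchronously and asks for a fresh copy rather than a cached one.
Function FetchFresh(ByVal url As String) As String
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", url, False
        ' Same header as in the answer: a date far in the past,
        ' so any cached copy is treated as out of date.
        .setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
        .send
        ' Assumption: anything other than HTTP 200 is treated as a failure.
        If .Status <> 200 Then
            Err.Raise vbObjectError + 513, "FetchFresh", _
                      "HTTP " & .Status & " for " & url
        End If
        FetchFresh = .responseText
    End With
End Function
```

With this in place, each html.body.innerHTML = .responseText step in CompanyData could become html.body.innerHTML = FetchFresh(url), where url is the search address built from ws.Range("A1") or the href read from the linkCompany element.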
