Using Apps Script to scrape a JavaScript-rendered web page
Problem description
I am struggling to put together a script that scrapes a JavaScript-rendered web page through Apps Script. I found "How to scrape Javascript rendered websites using Javascript?" here, but I don't know how to put it together, for example how to load puppeteer. Any help would be appreciated.
You can try scraping the initial HTML instead. Scraping the rendered HTML is extremely hard to do, because you would have to use a headless browser.
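As a rough sketch of scraping the initial HTML, the snippet below fetches a page with Apps Script's built-in `UrlFetchApp` service and then looks for data in the raw markup. It assumes the page ships its data as JSON inside a `<script>` tag with a known `id`, which some JavaScript-rendered sites do, but not all. The function names `fetchInitialHtml` and `extractEmbeddedJson` and the `scriptId` parameter are made up for illustration.

```javascript
// Fetch the HTML the server sends, before any client-side JavaScript runs.
// Works only inside Apps Script, where UrlFetchApp is a built-in service.
function fetchInitialHtml(url) {
  var response = UrlFetchApp.fetch(url, { muteHttpExceptions: true });
  if (response.getResponseCode() !== 200) {
    throw new Error('Request failed with status ' + response.getResponseCode());
  }
  return response.getContentText();
}

// Hypothetical helper: return the parsed JSON found inside
// <script id="scriptId" ...>...</script>, or null if no such tag exists.
// Assumes the tag body is valid JSON; JSON.parse throws otherwise.
function extractEmbeddedJson(html, scriptId) {
  // Escape regex metacharacters so the id is matched literally.
  var safeId = scriptId.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  var pattern = new RegExp(
    '<script[^>]*\\bid=["\']' + safeId + '["\'][^>]*>([\\s\\S]*?)<\\/script>',
    'i'
  );
  var match = pattern.exec(html);
  return match ? JSON.parse(match[1]) : null;
}
```

In Apps Script this might be called as `extractEmbeddedJson(fetchInitialHtml(url), 'some-id')`, where `'some-id'` is whatever id the target page uses. If the data is not present in the initial HTML at all, this approach will not help and a headless browser running outside Apps Script would be needed.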
There is this library: https://github.com/tautologistics/node-htmlparser, which you can use to parse HTML from JavaScript. It is written for Node, but because it doesn't use any dependencies, you can just copy and paste the functions you need.
Parsing is not a very easy task, I'm afraid.
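For very simple pages, a small dependency-free helper may be enough before reaching for a full parser. The sketch below is not code from node-htmlparser. It is a hypothetical regex-based function, `extractTagText`, that collects the text inside every occurrence of a given tag. It assumes the tag is not nested inside itself and that the markup is reasonably well formed. Anything more complex needs a real parser.

```javascript
// Hypothetical helper: return the text content of every <tagName>...</tagName>
// element in an HTML string. Breaks on tags nested inside the same tag.
function extractTagText(html, tagName) {
  var pattern = new RegExp(
    '<' + tagName + '\\b[^>]*>([\\s\\S]*?)<\\/' + tagName + '>',
    'gi'
  );
  var results = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    var text = match[1]
      .replace(/<[^>]+>/g, '')  // drop any inner tags
      .replace(/\s+/g, ' ')     // collapse runs of whitespace
      .trim();
    results.push(text);
  }
  return results;
}
```

Because it uses only plain string and `RegExp` operations, the same function runs unchanged in Apps Script and in Node.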