Load a SPA webpage via AJAX

Views: 186 | Published: 2019/6/7 18:27:51 | Tags: javascript jquery ajax single-page-application jquery-load


Problem description

I'm trying to fetch an entire webpage using JavaScript by plugging in the URL. However, the website is built as a Single Page Application (SPA) that uses JavaScript / backbone.js to dynamically load most of its contents after rendering the initial response.

So for example, when I route to the following address:

https://connect.garmin.com/modern/activity/1915361012

And then enter this into the console (after the page has loaded):

var $page = $("html")
console.log("%c✔: ", "color:green;", $page.find(".inline-edit-target.page-title-overflow").text().trim());
console.log("%c✔: ", "color:green;", $page.find("footer .details").text().trim());

Then I'll get the dynamically loaded activity title as well as the statically loaded page footer.

However, when I try to load the webpage via an AJAX call with either $.get() or .load(), I only receive the initial response (the same content you see with view-source):

view-source:https://connect.garmin.com/modern/activity/1915361012

So if I use either of the following AJAX calls:

// jQuery.get()
var url = "https://connect.garmin.com/modern/activity/1915361012";
jQuery.get(url,function(data) {
    var $page = $("<div>").html(data)
    console.log("%c✖: ", "color:red;",   $page.find(".page-title").text().trim());
    console.log("%c✔: ", "color:green;", $page.find("footer .details").text().trim());
});

// jQuery.load()
var url = "https://connect.garmin.com/modern/activity/1915361012";
var $page = $("<div>")
$page.load(url, function(data) {
    console.log("%c✖: ", "color:red;",   $page.find(".page-title").text().trim()    );
    console.log("%c✔: ", "color:green;", $page.find("footer .details").text().trim());
});

I'll still get the initial footer, but won't get any of the other page contents.

I've tried the solution here to eval() the contents of every script tag, but that doesn't appear robust enough to actually load the page:

jQuery.get(url,function(data) {
    var $page = $("<div>").html(data)
    $page.find("script").each(function() {
        var scriptContent = $(this).html(); //Grab the content of this tag
        eval(scriptContent); //Execute the content
    });
    console.log("%c✖: ", "color:red;",   $page.find(".page-title").text().trim());
    console.log("%c✔: ", "color:green;", $page.find("footer .details").text().trim());
});

Q: Are there any options to fully load a webpage so that it can be scraped with JavaScript?

Solution

You will never be able to fully replicate by yourself what an arbitrary (SPA) page does.

The only way I see is to use a headless browser such as PhantomJS, Headless Chrome, or Headless Firefox.

I wanted to try Headless Chrome so let's see what it can do with your page:

Quick check using internal REPL

Load that page with Chrome Headless (you'll need Chrome 59 on Mac/Linux, Chrome 60 on Windows), and find the page title with JavaScript from the REPL:

% chrome --headless --disable-gpu --repl https://connect.garmin.com/modern/activity/1915361012
[0830/171405.025582:INFO:headless_shell.cc(303)] Type a Javascript expression to evaluate or "quit" to exit.
>>> $('body').find('.page-title').text().trim() 
{"result":{"type":"string","value":"Daily Mile - Round 2 - Day 27"}}

NB: to get the chrome command working on a Mac, I set up this alias beforehand:

alias chrome="'/Applications/Google Chrome.app/Contents/MacOS/Google Chrome'"

Using it programmatically with Node & Puppeteer

Puppeteer is a Node library (by Google Chrome developers) which provides a high-level API to control headless Chrome over the DevTools Protocol. It can also be configured to use full (non-headless) Chrome.

(Step 0: Install Node & Yarn if you don't have them)

In a new directory:

yarn init
yarn add puppeteer

Create index.js with this:

const puppeteer = require('puppeteer');
(async () => {
    const url = 'https://connect.garmin.com/modern/activity/1915361012';
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    // Go to URL and wait for the page to load
    // ('networkidle2': no more than 2 network connections for at least 500 ms)
    await page.goto(url, {waitUntil: 'networkidle2'});
    // Wait for the results to show up
    await page.waitForSelector('.page-title');
    // Extract the results from the page
    const text = await page.evaluate(() => {
        const title = document.querySelector('.page-title');
        return title.innerText.trim();
    });
    console.log(`Found: ${text}`);
    // Wait for the browser process to shut down before the script exits
    await browser.close();
})();

Result:

$ node index.js 
Found: Daily Mile - Round 2 - Day 27
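
Since the question asks for the entire page rather than a single element, the same approach can be sketched as a small reusable helper that returns the rendered HTML. This is only a sketch under a few assumptions: the name fetchRenderedHtml, its parameters, and the 30-second default timeout are made up for illustration, and the puppeteer module is passed in as an argument so the helper does not depend on how the library is loaded. It relies on page.content(), which serializes the current DOM, including nodes added by the SPA after the initial response.

```javascript
// Hypothetical helper (not part of the original answer): returns the fully
// rendered HTML of a page instead of just the initial server response.
// `puppeteer` is injected by the caller; `readySelector` is an element that
// only exists once the SPA has finished rendering the part you care about.
async function fetchRenderedHtml(puppeteer, url, readySelector, timeoutMs = 30000) {
    const browser = await puppeteer.launch();
    try {
        const page = await browser.newPage();
        // Navigate and wait until the network is nearly idle
        await page.goto(url, { waitUntil: 'networkidle2', timeout: timeoutMs });
        // Wait until the dynamically loaded element has been rendered
        await page.waitForSelector(readySelector, { timeout: timeoutMs });
        // Serialize the current DOM, including dynamically inserted nodes
        return await page.content();
    } finally {
        // Always close the browser, even if navigation or the wait fails
        await browser.close();
    }
}

// Usage (assumes puppeteer was installed with `yarn add puppeteer` as above):
// const puppeteer = require('puppeteer');
// fetchRenderedHtml(puppeteer, 'https://connect.garmin.com/modern/activity/1915361012', '.page-title')
//     .then(html => console.log(html.length));
```

The returned string can then be handed to whatever parser you prefer. The try/finally block is a design choice to avoid leaving a headless Chrome process running when the selector never appears and the wait times out.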
