Can the browser be turned headless mid-execution when it was started normally, or vice versa?

Problem description

I want to start a Chromium browser instance headless, do some automated operations, and then make it visible before doing the rest of the work.

Is this possible to do using Puppeteer, and if it is, can you tell me how? And if it is not, is there any other framework or library for browser automation that can do this?

So far I've tried the following, but it didn't work.

const browser = await puppeteer.launch({'headless': false});
browser.headless = true;
const page = await browser.newPage();
await page.goto('https://news.ycombinator.com', {waitUntil: 'networkidle2'});
await page.pdf({path: 'hn.pdf', format: 'A4'});


Recommended answer

Short answer: It's not possible

Chrome can only be started in either headless or non-headless mode. You have to specify the mode when you launch the browser, and it is not possible to switch at runtime.

What is possible is to launch a second browser and reuse cookies (and any other data) from the first browser.

You might assume that you could simply reuse the data directory when calling puppeteer.launch, but this is currently not possible due to multiple bugs (#1268, #1270 in the Puppeteer repo).

So the best approach is to save any cookies or local storage data that you need to share between the browser instances, and restore that data when you launch the second browser. You then visit the website a second time. Be aware that any state the website holds in JavaScript variables will be lost when you reload the page.
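For the local storage part, one possible approach is to read and write the entries through page.evaluate. This is only a minimal sketch: the helper names saveLocalStorage and restoreLocalStorage are made up for illustration, and it assumes the new page has already navigated to the same origin before restoring, since local storage is scoped per origin.

```javascript
// Hypothetical helper: copy every key/value pair from the page's
// localStorage into a plain object that can be passed to another browser.
async function saveLocalStorage(page) {
  return page.evaluate(() => {
    const data = {};
    for (let i = 0; i < localStorage.length; i++) {
      const key = localStorage.key(i);
      data[key] = localStorage.getItem(key);
    }
    return data;
  });
}

// Hypothetical helper: write the saved pairs back into localStorage.
// Assumption: `page` is already on the same origin the data came from.
async function restoreLocalStorage(page, data) {
  await page.evaluate((entries) => {
    for (const [key, value] of Object.entries(entries)) {
      localStorage.setItem(key, value);
    }
  }, data);
}
```

After restoring, you would typically reload the page so that the site's scripts pick up the restored values.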

Summing up, the whole process should look like this (or vice versa for headless to headful):

  • Crawl in non-headless mode until you want to switch mode
  • Serialize cookies
  • Launch or reuse second browser (in headless mode)
  • Restore cookies
  • Revisit page
  • Continue crawling
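The steps above could be sketched roughly as follows. This is an illustration, not a definitive implementation: the function name switchToHeadless is hypothetical, the puppeteer module is passed in as a parameter, and only cookies are carried over. Note that page.cookies() returns the cookies for the page's current URL, and newer Puppeteer releases may prefer browser-level cookie methods instead.

```javascript
// Hypothetical helper: hand a session over from a visible browser
// to a freshly launched headless one.
async function switchToHeadless(puppeteer, visibleBrowser, visiblePage) {
  // Remember where we were and serialize the cookies for that page.
  const url = visiblePage.url();
  const cookies = await visiblePage.cookies();
  await visibleBrowser.close();

  // Launch the second browser in headless mode.
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();

  // Restore the cookies, then revisit the page.
  await page.setCookie(...cookies);
  await page.goto(url, { waitUntil: 'networkidle2' });

  // The caller continues crawling with the new browser and page.
  return { browser, page };
}
```

To go from headless to headful instead, the same sketch applies with headless: false in the launch options.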