Can a browser be switched to headless mode mid-execution when it was started normally, or vice versa?
Question
I want to start a Chromium browser instance in headless mode, do some automated operations, and then make it visible before doing the rest of the work.
Is this possible with Puppeteer, and if so, can you tell me how? If it is not, is there any other browser automation framework or library that can do this?
So far I've tried the following, but it didn't work.
const browser = await puppeteer.launch({'headless': false});
browser.headless = true; // has no effect: the mode is fixed when the browser is launched
const page = await browser.newPage();
await page.goto('https://news.ycombinator.com', {waitUntil: 'networkidle2'});
await page.pdf({path: 'hn.pdf', format: 'A4'});
Recommended answer
Short answer: It's not possible
Chrome can only be started in either headless or non-headless mode. You have to specify the mode when you launch the browser, and it cannot be switched at runtime.
What is possible is to launch a second browser and reuse the cookies (and any other data) from the first browser.
You might assume that you could simply reuse the data directory when calling puppeteer.launch, but this is currently not possible due to multiple bugs (#1268, #1270 in the puppeteer repo).
So the best approach is to save any cookies or local storage data that you need to share between the browser instances and restore that data when you launch the second browser. You then visit the website a second time. Be aware that any state the website holds in JavaScript variables will be lost when you revisit the page.
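For the local storage part, a minimal sketch could look like the following. The helper names saveLocalStorage and restoreLocalStorage are hypothetical and not part of Puppeteer; only page.evaluate is a Puppeteer call. It assumes all values you care about live in the localStorage of the page's current origin.

```javascript
// Hypothetical helper: read all localStorage entries of the page's
// current origin into a plain object.
async function saveLocalStorage(page) {
  return page.evaluate(() => {
    const data = {};
    for (let i = 0; i < localStorage.length; i++) {
      const key = localStorage.key(i);
      data[key] = localStorage.getItem(key);
    }
    return data;
  });
}

// Hypothetical helper: write the entries back. The page must already be
// on the same origin, because localStorage is scoped per origin.
async function restoreLocalStorage(page, entries) {
  await page.evaluate((data) => {
    for (const [key, value] of Object.entries(data)) {
      localStorage.setItem(key, value);
    }
  }, entries);
}
```

Because the data can only be written once the new page is on the right origin, you would typically navigate to the site first, call restoreLocalStorage, and then reload so the site's scripts see the restored values.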
To sum up, the whole process should look like this (or vice versa for headless to headful):
- Crawl in non-headless mode until you want to switch modes
- Serialize the cookies
- Launch or reuse a second browser (in headless mode)
- Restore the cookies
- Revisit the page
- Continue crawling
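The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: switchMode is a hypothetical helper name, the puppeteer module is passed in as an argument (for example require('puppeteer')), and only the cookies visible to the current page are carried over.

```javascript
// Hypothetical helper: "switch" modes by launching a second browser
// and carrying the cookies of the current page over to it.
async function switchMode(puppeteer, oldPage, headless) {
  const url = oldPage.url();

  // Serialize the cookies of the current page.
  const cookies = await oldPage.cookies();

  // Launch the second browser in the requested mode.
  const browser = await puppeteer.launch({ headless });
  const page = await browser.newPage();

  // Restore the cookies before navigating, so the first request sends them.
  if (cookies.length > 0) {
    await page.setCookie(...cookies);
  }

  // Revisit the page. JavaScript state of the old page is not carried over.
  await page.goto(url, { waitUntil: 'networkidle2' });

  // The first browser is no longer needed.
  await oldPage.browser().close();

  return { browser, page };
}
```

For the scenario in the question (start headless, then become visible), you would launch with headless set to true, do the automated operations, and then call switchMode(puppeteer, page, false) and continue with the returned page.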