"Eager" Page Load Strategy workaround for Chromedriver Selenium in Python

Question

I want to speed up page loading in Selenium because I don't need anything more than the HTML (I am trying to scrape all the links using BeautifulSoup). Using PageLoadStrategy.NONE doesn't work to scrape all the links, and Chrome no longer supports PageLoadStrategy.EAGER. Does anyone know of a workaround to get PageLoadStrategy.EAGER in Python?

Solution

ChromeDriver is the standalone server which implements WebDriver's wire protocol for Chromium. Chrome and Chromium are still in the process of implementing and moving to the W3C standard. Currently ChromeDriver is available for Chrome on Android and Chrome on Desktop (Mac, Linux, Windows and ChromeOS).

The current WebDriver W3C Editor's Draft contains a table of page load strategies that links the pageLoadStrategy capability keyword to a page loading strategy state and shows which document readiness state corresponds to it: "none" waits for no readiness state, "eager" corresponds to "interactive", and "normal" corresponds to "complete".

However, if you observe the current implementation of ChromeDriver, Chrome DevTools does take the following document.readyState values into account:

  • document.readyState == 'complete'
  • document.readyState == 'interactive'

Here is a sample relevant log:

[1517231304.270][DEBUG]: DEVTOOLS COMMAND Runtime.evaluate (id=11) {
   "expression": "var isLoaded = document.readyState == 'complete' ||    document.readyState == 'interactive';if (isLoaded) {  var frame = document.createElement('iframe');  frame.name = 'chromedriver dummy frame'; ..."
}

As per WebDriver Status, you will find the list of all WebDriver commands and their current support in ChromeDriver, based on what is in the WebDriver Specification. Once the implementation is complete in all respects, PageLoadStrategy.EAGER is bound to be functionally present in ChromeDriver.
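
Until then, one possible workaround is to emulate the eager strategy by hand: start the navigation with the "none" strategy and poll document.readyState until it reaches 'interactive' or 'complete', the same two states that appear in the log above. The following is only a sketch under assumptions and not part of ChromeDriver itself. It assumes the Selenium 4 Python bindings (where the options object exposes page_load_strategy). The helper names wait_for_ready_state and fetch_html_eagerly are hypothetical. If your ChromeDriver version accepts "eager" directly, none of this is needed.

```python
import time

# Returns the current document URL together with its readiness state.
READY_SCRIPT = "return [document.URL, document.readyState];"


def wait_for_ready_state(driver, states=("interactive", "complete"),
                         timeout=10.0, poll=0.1, previous_url=None):
    """Poll document.readyState until it is one of `states`.

    `driver` is any object with an execute_script() method, such as a
    Selenium WebDriver. `previous_url` guards against reading the state
    of the page that was open before the navigation started.
    Returns the state reached, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while True:
        url, state = driver.execute_script(READY_SCRIPT)
        if state in states and url != previous_url:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"readyState still {state!r} after {timeout}s")
        time.sleep(poll)


def fetch_html_eagerly(url, timeout=10.0):
    """Hypothetical helper: emulate 'eager' on top of the 'none' strategy."""
    # Imported lazily so the polling helper above works without Selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.page_load_strategy = "none"  # get() returns without waiting
    driver = webdriver.Chrome(options=options)
    try:
        previous_url = driver.current_url  # blank start page of a new session
        driver.get(url)
        wait_for_ready_state(driver, timeout=timeout,
                             previous_url=previous_url)
        return driver.page_source
    finally:
        driver.quit()
```

The string returned by fetch_html_eagerly can then be handed to BeautifulSoup to collect the links. Note that the previous_url check assumes the target URL differs from the page that was open before the navigation. Links injected by scripts after the 'interactive' state will still be missed, exactly as they would be under a real eager strategy.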
