Don't wait for a page to load using Selenium in Python

Question

How do I make Selenium click on elements and scrape data before the page has fully loaded? My internet connection is quite bad, so it sometimes takes forever to load the page entirely. Is there any way around this?

Recommended answer

ChromeDriver 77.0 (which supports Chrome version 77) now supports eager as a pageLoadStrategy.

Resolved issue 1902: Support eager page load strategy [Pri-2]


Since your question is about clicking on elements and scraping data before the page has fully loaded, the pageLoadStrategy capability can help. By default, Selenium loads a page/URL with pageLoadStrategy set to normal, so it waits for the complete page load before executing the next line of code. Selenium can instead resume execution at an earlier document readiness state. It currently supports three document readiness states, which can be configured through pageLoadStrategy as follows:

  1. none (undefined)
  2. eager (page becomes interactive)
  3. normal (complete page load)
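
To make the three settings concrete, here is a small illustrative model in plain Python. It is not part of Selenium; the names READY_STATE_ORDER, STRATEGY_TARGET and navigation_returns are hypothetical, and the mapping assumes the usual reading of the strategies: none does not wait, eager waits for document.readyState to reach "interactive", and normal waits for "complete".

```python
# Illustrative model only, not Selenium code: all names here are hypothetical.

# document.readyState values in the order a page passes through them.
READY_STATE_ORDER = ("loading", "interactive", "complete")

# The readyState each strategy waits for; None means "do not wait at all".
STRATEGY_TARGET = {
    "none": None,
    "eager": "interactive",
    "normal": "complete",
}


def navigation_returns(strategy, ready_state):
    """Return True if driver.get() would already have returned for this
    strategy once the document has reached the given readyState."""
    target = STRATEGY_TARGET[strategy]
    if target is None:
        return True
    reached = READY_STATE_ORDER.index(ready_state)
    needed = READY_STATE_ORDER.index(target)
    return reached >= needed
```

Under this model, a page that is still fetching images (readyState "interactive") has already handed control back to the script with eager, but not with normal.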

Here is the code block to configure the pageLoadStrategy:

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

# Local paths from the original answer; adjust them for your machine
binary = r'C:\Program Files\Mozilla Firefox\firefox.exe'
# Copy the defaults so the shared FIREFOX dict is not modified in place
caps = DesiredCapabilities.FIREFOX.copy()
# caps["pageLoadStrategy"] = "normal"  #  complete
caps["pageLoadStrategy"] = "eager"  #  interactive
# caps["pageLoadStrategy"] = "none"   #  undefined
# Selenium 3 style keyword arguments
driver = webdriver.Firefox(capabilities=caps, firefox_binary=binary, executable_path=r"C:\Utility\BrowserDrivers\geckodriver.exe")
driver.get("https://google.com")
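
The snippet above uses Selenium 3 style arguments. As a rough sketch for Selenium 4, assuming a version where browser Options objects expose a page_load_strategy attribute and where Selenium can find geckodriver on its own, the same setting could look like the following. The helper names make_firefox_options and click_when_ready are hypothetical, and the Selenium imports are done lazily inside the functions only so that the validation step can run without a browser installed.

```python
# Sketch under assumptions: Selenium 4+, Firefox and geckodriver available.
# make_firefox_options and click_when_ready are hypothetical helper names.

VALID_STRATEGIES = ("normal", "eager", "none")


def make_firefox_options(strategy="eager"):
    """Build Firefox options with the requested page load strategy."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown page load strategy: {strategy!r}")
    # Imported here so the check above works even without Selenium installed.
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.page_load_strategy = strategy
    return options


def click_when_ready(url, field_name, timeout=10):
    """Open url with the eager strategy and click the element whose name
    attribute is field_name as soon as it becomes clickable."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox(options=make_firefox_options("eager"))
    try:
        # With "eager", get() returns before images and stylesheets finish.
        driver.get(url)
        # The target element may still be missing, so wait for it explicitly.
        element = WebDriverWait(driver, timeout).until(
            EC.element_to_be_clickable((By.NAME, field_name))
        )
        element.click()
    finally:
        driver.quit()
```

Calling click_when_ready("https://google.com", "q") would exercise it, where "q" is only an assumed element name. The explicit wait matters because an early return from get() does not guarantee that the element you want is already in the page.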
