How to make Selenium not wait for a full page load when the page has a slow script?

Question

Selenium's driver.get(url) waits until the full page has loaded. However, the page I am scraping tries to load a dead JS script, so my Python script waits for it and hangs for a few minutes. This problem can occur on every page of the site.

from selenium import webdriver

driver = webdriver.Chrome()
driver.get('https://www.cortinadecor.com/productos/17/estores-enrollables-screen/estores-screen-corti-3000')
# The page tries to load: https://www.cetelem.es/eCommerceCalculadora/resources/js/eCalculadoraCetelemCombo.js
driver.find_element_by_name('ANCHO').send_keys("100")

How can I limit the wait time, block the AJAX load of that file, or solve this some other way?

I am testing my script with webdriver.Chrome(), but I will use PhantomJS() or probably Firefox(). So if a method relies on changing browser settings, it must work across browsers.

Recommended answer

By default, when Selenium loads a page/URL it follows a configuration in which pageLoadStrategy is set to normal. To make Selenium not wait for the full page load, we can configure pageLoadStrategy, which supports three values:

  1. normal (full page load)
  2. eager (interactive)
  3. none
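
To make the difference between the three values concrete, here is a toy model (not Selenium's own code) of the point at which driver.get() would return under each strategy, expressed in terms of the page's document.readyState. The names REQUIRED_READY_STATE and get_would_return are hypothetical and exist only for this illustration.

```python
# Toy model only: which document.readyState each strategy waits for.
# None means "do not wait for any readyState".
REQUIRED_READY_STATE = {
    "normal": "complete",     # full page load, including scripts and images
    "eager": "interactive",   # DOM is ready, subresources may still be loading
    "none": None,             # return right after the navigation starts
}

# readyState values in the order a page passes through them.
_READY_STATE_ORDER = ["loading", "interactive", "complete"]


def get_would_return(strategy, ready_state):
    """Return True if, in this model, driver.get() has stopped waiting."""
    required = REQUIRED_READY_STATE[strategy]
    if required is None:
        return True
    reached = _READY_STATE_ORDER.index(ready_state)
    return reached >= _READY_STATE_ORDER.index(required)
```

In this model a page stuck at "interactive" because of a dead script keeps a normal session waiting, while eager and none sessions have already moved on.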

Here is the code to configure pageLoadStrategy:

  • Firefox:

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

caps = DesiredCapabilities().FIREFOX
caps["pageLoadStrategy"] = "normal"  # default: wait for readyState "complete"
#caps["pageLoadStrategy"] = "eager"  # wait only for readyState "interactive"
#caps["pageLoadStrategy"] = "none"   # do not wait for the page load at all
driver = webdriver.Firefox(desired_capabilities=caps, executable_path=r'C:\path\to\geckodriver.exe')
driver.get("http://google.com")

  • Chrome:

    from selenium import webdriver
    from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
    
    caps = DesiredCapabilities().CHROME
    caps["pageLoadStrategy"] = "normal"  # default: wait for readyState "complete"
    #caps["pageLoadStrategy"] = "eager"  # wait only for readyState "interactive"
    #caps["pageLoadStrategy"] = "none"   # do not wait for the page load at all
    driver = webdriver.Chrome(desired_capabilities=caps, executable_path=r'C:\path\to\chromedriver.exe')
    driver.get("http://google.com")
    

  • Note: the pageLoadStrategy values normal, eager and none are required by the WebDriver W3C Editor's Draft, but at the time of writing eager was still a WIP (Work In Progress) in the ChromeDriver implementation. A detailed discussion is in "Eager" Page Load Strategy workaround for Chromedriver Selenium in Python.
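
With a strategy other than normal, driver.get() can return before the element you need exists, so the lookup from the question would usually need an explicit wait. The following is a minimal sketch of that combination, assuming Selenium 4, where the strategy is set through the options object rather than DesiredCapabilities. The helper names validate_strategy and fill_width, the 10-second timeout and the choice of "none" are assumptions for illustration, not part of the original answer.

```python
VALID_STRATEGIES = ("normal", "eager", "none")


def validate_strategy(strategy):
    """Return the strategy unchanged, or raise ValueError if it is unknown."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError("unknown pageLoadStrategy: %r" % (strategy,))
    return strategy


def fill_width(url, value="100", strategy="none", timeout=10):
    """Open url without waiting for the full load, then fill the ANCHO field.

    Hypothetical helper: Selenium is imported here so the module can be
    loaded even where Selenium or a browser is not installed.
    """
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.page_load_strategy = validate_strategy(strategy)
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)  # with "none", returns without waiting for the load
        # Wait explicitly for the one element the script actually needs.
        field = WebDriverWait(driver, timeout).until(
            EC.element_to_be_clickable((By.NAME, "ANCHO"))
        )
        field.send_keys(value)
    finally:
        driver.quit()
```

Swapping webdriver.ChromeOptions() and webdriver.Chrome for webdriver.FirefoxOptions() and webdriver.Firefox should give the Firefox equivalent, since the page_load_strategy attribute is shared by the options classes.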
