Setting a timeout on Selenium webdriver.PhantomJS
Question
The situation
I have a simple Python script to get the HTML source for a given URL:
from selenium import webdriver

browser = webdriver.PhantomJS()
browser.get(url)  # blocks until the page and its external resources have loaded
content = browser.page_source
Occasionally, the URL points to a page with slow-loading external resources (e.g. video files or really slow advertising content).
WebDriver waits until those resources have loaded before completing the .get(url) request.
Note: For extraneous reasons, I need to do this with PhantomJS rather than requests or urllib2.
The question
I'd like to set a timeout on PhantomJS resource loading so that, if a resource takes too long to load, the browser simply treats it as missing and moves on.
This would allow me to perform the subsequent .page_source query based on whatever the browser has loaded by then.
Documentation on webdriver.PhantomJS is very thin, and I haven't found a similar question on SO.
Thanks in advance!
Recommended answer
PhantomJS provides a resourceTimeout setting, which might suit your needs. To quote the documentation:
(in milli-secs) defines the timeout after which any resource requested will stop trying and proceed with other parts of the page. onResourceTimeout callback will be called on timeout.
So in Ruby, you can do something like this:
require 'selenium-webdriver'
capabilities = Selenium::WebDriver::Remote::Capabilities.phantomjs("phantomjs.page.settings.resourceTimeout" => "5000")
driver = Selenium::WebDriver.for :phantomjs, :desired_capabilities => capabilities
I believe in Python it would be something like the following (untested; it only shows the idea, so as the Python developer you may need to adapt it):
driver = webdriver.PhantomJS(desired_capabilities={'phantomjs.page.settings.resourceTimeout': '5000'})
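A slightly fuller sketch of the same idea, equally untested against a real PhantomJS binary. The helper names build_phantomjs_capabilities and fetch_page_source are made up for illustration, and the code assumes a Selenium release that still ships webdriver.PhantomJS and accepts the desired_capabilities keyword (newer releases have dropped PhantomJS support). The timeout is passed as a string, mirroring the Ruby example; whether an integer works as well is something I have not checked.

```python
def build_phantomjs_capabilities(resource_timeout_ms=5000):
    """Build a capabilities dict that carries PhantomJS's resourceTimeout setting.

    Hypothetical helper: the key name follows the Ruby example above, and the
    value is sent as a string of milliseconds.
    """
    return {
        "browserName": "phantomjs",
        "phantomjs.page.settings.resourceTimeout": str(resource_timeout_ms),
    }


def fetch_page_source(url, resource_timeout_ms=5000):
    """Load url in PhantomJS and return whatever HTML has been loaded."""
    # Imported lazily so the capabilities helper works without Selenium installed.
    # Assumption: an older Selenium release that still provides webdriver.PhantomJS.
    from selenium import webdriver

    browser = webdriver.PhantomJS(
        desired_capabilities=build_phantomjs_capabilities(resource_timeout_ms)
    )
    try:
        browser.get(url)
        return browser.page_source
    finally:
        # Always shut the PhantomJS process down, even if get() raises.
        browser.quit()
```

You would then call fetch_page_source(url) in place of the three lines in the question, lowering resource_timeout_ms if slow ads or videos still hold the page up.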