Selenium implicit wait doesn't work
Question
This is my first time using Selenium with a headless browser; I want to crawl some web pages that rely on AJAX.
It works well, but in some cases loading the whole page takes too long (especially when some resource is unavailable), so I have to set a timeout for Selenium.
First I tried set_page_load_timeout() and set_script_timeout(), but with these timeouts set I get no page source at all if the page doesn't load completely, as in the code below:
driver = webdriver.Chrome(chrome_options=options)
driver.set_page_load_timeout(5)
driver.set_script_timeout(5)
try:
    driver.get(url)
except Exception:
    driver.execute_script('window.stop()')
print driver.page_source.encode('utf-8')  # TimeoutException is raised on this line
So I tried using an implicit wait together with an explicit (conditional) wait, like this:
driver = webdriver.Firefox(firefox_options=options, executable_path=path)
print("Firefox Headless Browser Invoked")
wait = WebDriverWait(driver, timeout=10)
driver.implicitly_wait(2)
start = time.time()
driver.get(url)
end = time.time()
print 'time used: %s s' % str(end - start)
try:
    WebDriverWait(driver, 2, 0.5).until(expected.presence_of_element_located((By.TAG_NAME, 'body')))
    print driver.find_element_by_tag_name('body').text
except Exception:
    driver.execute_script('window.stop()')
This time I got the content I wanted. However, it took a very long time (40+ seconds), which means the 2-second timeout I set doesn't work at all.
As I see it, the driver.get() call doesn't return until the browser stops loading the page; only after that can the code below it run, and you can't kill the get() call or you'll get nothing.
But this is very different from what the Selenium docs describe, and I really wonder where my mistake is.
Environment: OS X 10.12, selenium 3.0.9 with Firefox and Google Chrome headless (both latest versions).
---- Update ----
Thanks for the help. I changed the code as below, using WebDriverWait() alone, but there are still cases where the call lasts a very long time, far longer than the timeout I set.
Can I stop the page load immediately once the time is up?
driver = webdriver.Firefox(firefox_options=options, executable_path=path)
print("Firefox Headless Browser Invoked")
start = time.time()
driver.get('url')
end = time.time()
print 'time used: %s s' % str(end - start)
try:
    WebDriverWait(driver, 2, 0.5).until(expected.presence_of_element_located((By.TAG_NAME, 'body')))
    print driver.find_element_by_tag_name('body').text
except Exception:
    driver.execute_script('window.stop()')
driver.quit()
Here is the terminal output from a test run:
Firefox Headless Browser Invoked
time used: 44.6049938202 s
According to the code, this means the driver.get() call takes 44 seconds to finish, which is unexpected. Have I misunderstood the behavior of headless browsers?
Recommended answer
As you mentioned in your question, it is quite possible for the whole page to take too long to load (especially when some resource is unavailable) if the Application Under Test (AUT) uses JavaScript or AJAX calls.
- In your first scenario you set both set_page_load_timeout(5) and set_script_timeout(5):
  - set_page_load_timeout(time_to_wait): Sets the amount of time to wait for a page load to complete before throwing an exception.
  - set_script_timeout(time_to_wait): Sets the amount of time that the script should wait during an execute_async_script call before throwing an exception.
Hence, with both timeouts in place, an Application Under Test that depends on JavaScript or AJAX calls raises a TimeoutException.
- In your second scenario you set both implicitly_wait(2) and WebDriverWait(driver, 2, 0.5):
  - implicitly_wait(time_to_wait): Sets the timeout to implicitly wait for an element to be found or a command to complete.
  - WebDriverWait(driver, timeout, poll_frequency=0.5, ignored_exceptions=None): Sets the timeout in conjunction with different expected_conditions.
- But you are experiencing a very long timeout (40+ seconds) because, as the docs clearly state, you should not mix implicit and explicit waits, which can cause unpredictable wait times:
WARNING: Do not mix implicit and explicit waits. Doing so can cause unpredictable wait times. For example, setting an implicit wait of 10 seconds and an explicit wait of 15 seconds could cause a timeout to occur after 20 seconds.
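To make the arithmetic in the warning concrete, here is a rough simulation. It is not Selenium itself: it assumes the element never appears, that every lookup made by the explicit wait blocks for the full implicit timeout before failing, and that the explicit wait then sleeps for its poll interval and only afterwards checks its deadline. The name simulate_mixed_waits and this simplified loop are illustrative assumptions.

```python
def simulate_mixed_waits(implicit, explicit, poll=0.5):
    """Simulated seconds until an explicit wait gives up on a missing element.

    Simplified model (an assumption, not Selenium's actual code): every lookup
    blocks for the full implicit wait and then fails, after which the explicit
    wait sleeps for `poll` seconds and only then compares the clock with its
    deadline.
    """
    now = 0.0
    deadline = explicit
    while True:
        now += implicit   # the lookup blocks for the whole implicit wait
        now += poll       # the explicit wait sleeps before re-checking
        if now > deadline:
            return now
```

In this model, simulate_mixed_waits(10, 15) returns 21.0: the second lookup starts at 10.5 s, before the 15-second deadline, and does not fail until 20.5 s, which is in line with the roughly 20 seconds the warning mentions. With the question's settings, simulate_mixed_waits(2, 2) gives only 2.5, which suggests the waits alone do not account for a 40+ second delay.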
Solution:
The best solution would be to remove all instances of implicitly_wait(time_to_wait) and replace them with WebDriverWait() for stable behavior of the Application Under Test (AUT).
As per your counter question, the current code block looks perfect. The time you are seeing as time used: 44.6049938202 s is the time required for the web page to load completely and functionally, that is, the time required for the client (i.e. the web browser) to return control to the WebDriver instance once 'document.readyState' equals "complete". Neither Selenium nor you as a user has any control over this rendering process. However, for better performance you may follow these best practices:
- Keep your JDK version updated (currently Java SE Development Kit 8u162).
- Keep your Selenium client version updated (currently selenium 3.9.0).
- Keep your WebDriver version updated.
- Keep your web browser version updated.
- Clean your Project Workspace within your IDE regularly to build your project with required dependencies only.
- Use the CCleaner tool to wipe away OS chores before and after your Test Suite execution.
- If your web browser base version is too old, uninstall the web browser through Revo Uninstaller and install a recent GA released version of the web browser.
- Execute your test.
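One way to check the point about document.readyState is to time the get() call and then ask the browser for its ready state. The sketch below is an illustration under assumptions: timed_get is a hypothetical helper name, and driver is taken to be an already created WebDriver instance; only driver.get() and driver.execute_script() are used.

```python
import time


def timed_get(driver, url):
    """Load `url` and return (seconds spent in get(), document.readyState).

    `driver` is assumed to be an existing Selenium WebDriver instance.
    """
    start = time.time()
    driver.get(url)  # blocks until the browser hands control back
    elapsed = time.time() - start
    # Ask the page itself how far loading has progressed.
    ready_state = driver.execute_script("return document.readyState")
    return elapsed, ready_state
```

If get() returns normally under the default settings, ready_state would be expected to read "complete", and elapsed corresponds to the figure printed as time used in the question.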