Selenium Firefox headless returns different results


Problem description

When I scrape a page that lists products with the headless option enabled, I get different results.
For the same query, I sometimes get unsorted results and other times correctly sorted results.

Selenium Firefox browser:

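from selenium import webdriver
from selenium.webdriver.firefox.options import Options

# firefox_driver holds the path to the geckodriver executable
firefox_options = Options()
firefox_options.headless = True
browser = webdriver.Firefox(options=firefox_options, executable_path=firefox_driver)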

According to this post:
"firefox does not send different headers when using the headless option".

How can I use the headless option and still get consistent results from scraping?

Update:

It turns out that an ad popup window was hiding the price sort menu. Setting a constant window size, as posted by DebanjanB, solved the problem.

Thanks for the suggestions.

Recommended answer

Ideally, using or not using firefox_options.headless = True shouldn't have any major effect on the elements rendered within the DOM tree, but it can make a significant difference as far as the viewport is concerned.

As an example, when GeckoDriver/Firefox is initialized with the --headless option the default viewport is width = 1366px, height = 768px, whereas when GeckoDriver/Firefox is initialized without the --headless option the default viewport is width = 1382px, height = 744px.

  • Example Code:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

# First run: headless Firefox
options = webdriver.FirefoxOptions()
options.headless = True
driver = webdriver.Firefox(options=options, executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe')
driver.get("https://www.google.com/")
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.NAME, "q")))
print("Headless Firefox Initialized")
size = driver.get_window_size()
print("Window size: width = {}px, height = {}px".format(size["width"], size["height"]))
driver.quit()

# Second run: regular (headed) Firefox, for comparison
driver = webdriver.Firefox(executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe')
driver.get("https://www.google.com/")
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.NAME, "q")))
print("Firefox Initialized")
size = driver.get_window_size()
print("Window size: width = {}px, height = {}px".format(size["width"], size["height"]))
driver.quit()

  • Console Output:

    Headless Firefox Initialized
    Window size: width = 1366px, height = 768px
    Firefox Initialized
    Window size: width = 1382px, height = 744px
    

  • From the above observation it can be inferred that with the --headless option GeckoDriver/Firefox opens the browsing context with a narrower viewport (1366px versus 1382px wide), and hence fewer elements may be identified.
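
One way to check whether a given element actually falls inside the current viewport is sketched below. This is only an illustration, not part of the original answer: the helper names is_within_viewport and element_in_viewport are hypothetical, and the sketch assumes that `driver` is a Selenium WebDriver and `element` a WebElement obtained from it. It relies on the browser's getBoundingClientRect(), window.innerWidth and window.innerHeight.

```python
def is_within_viewport(box, viewport_width, viewport_height):
    """Return True if a viewport-relative box lies fully inside the viewport."""
    return (
        box["left"] >= 0
        and box["top"] >= 0
        and box["right"] <= viewport_width
        and box["bottom"] <= viewport_height
    )


# JavaScript that reports an element's box relative to the viewport,
# together with the current viewport size.
_BOX_SCRIPT = """
const r = arguments[0].getBoundingClientRect();
return {left: r.left, top: r.top, right: r.right, bottom: r.bottom,
        vw: window.innerWidth, vh: window.innerHeight};
"""


def element_in_viewport(driver, element):
    """Ask the browser for the element's box and test it against the viewport."""
    info = driver.execute_script(_BOX_SCRIPT, element)
    return is_within_viewport(info, info["vw"], info["vh"])
```

Calling element_in_viewport(driver, some_element) in both headless and headed runs may help show whether a layout difference, rather than a difference in the DOM, explains the differing results. It does not detect an element that is covered by another element, such as a popup.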

    When using GeckoDriver/Firefox to initiate a browsing context, always open it in maximized mode or set the size explicitly through set_window_size(), as follows:

    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC

    options = webdriver.FirefoxOptions()
    options.headless = True
    # Firefox takes its startup window size from the --width / --height switches
    # (start-maximized and window-size=... are Chrome switches).
    options.add_argument("--width=1400")
    options.add_argument("--height=600")
    driver = webdriver.Firefox(options=options, executable_path=r'C:\Utility\BrowserDrivers\geckodriver.exe')
    driver.get("https://www.google.com/")
    # Explicitly resize the window; this overrides the startup size.
    driver.set_window_size(1920, 1080)
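
The fixed-window-size advice can also be wrapped in a small helper so that headless and headed runs start from the same size. The following is a minimal sketch, not the original author's code: build_firefox_args and make_driver are hypothetical names, and it assumes geckodriver is discoverable on PATH. It passes -headless as a command-line argument instead of setting options.headless, because newer Selenium 4 releases no longer accept the headless attribute or the executable_path parameter.

```python
def build_firefox_args(headless=True, width=1920, height=1080):
    """Return Firefox command-line arguments for a fixed startup window size."""
    args = ["--width={}".format(width), "--height={}".format(height)]
    if headless:
        # Put the headless switch first; Firefox accepts -headless.
        args.insert(0, "-headless")
    return args


def make_driver(headless=True, width=1920, height=1080):
    """Create a Firefox driver whose window size is the same in both modes."""
    # Imported lazily so that build_firefox_args can be used without Selenium.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    for arg in build_firefox_args(headless, width, height):
        options.add_argument(arg)
    # Assumption: geckodriver is on PATH, so no driver path is passed here.
    driver = webdriver.Firefox(options=options)
    # Set the size explicitly as well, as recommended in the answer.
    driver.set_window_size(width, height)
    return driver
```

With this in place, make_driver(headless=True) and make_driver(headless=False) should both report the requested window size from get_window_size(), which makes it easier to compare scraping results between the two modes.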
    


    tl; dr

    You can find a couple of relevant discussions on window size in:

    • python: How to set window size in Selenium Chrome Python
    • java: Not able to maximize Chrome Window in headless mode
