Taking a screenshot of the whole page with Python Selenium and headless Firefox or Chrome


Question

This post is related to:

Python Selenium screenshot does not capture the whole page

The solution with PhantomJS seems to work:

from selenium import webdriver

driver = webdriver.PhantomJS()
driver.maximize_window()
driver.get('http://www.angelfire.com/super/badwebs/')
# Scroll through the page in small steps before taking the screenshot
scheight = .1
while scheight < 9.9:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight/%s);" % scheight)
    scheight += .01
driver.save_screenshot('angelfire_phantomjs.png')

However, that solution is from 2014, and PhantomJS has since been deprecated. I now get this warning:

...
UserWarning: Selenium support for PhantomJS has been deprecated, please use headless versions of Chrome or Firefox instead
warnings.warn('Selenium support for PhantomJS has been deprecated, please use headless '

If I try to adapt it to, for example, headless Firefox like this:

from selenium import webdriver

firefox_options = webdriver.FirefoxOptions()
# set_headless() and the firefox_options= keyword are deprecated in later Selenium releases
firefox_options.add_argument("-headless")
firefox_driver = webdriver.Firefox(options=firefox_options)

firefox_driver.get('http://www.angelfire.com/super/badwebs/')  
scheight = .1
while scheight < 9.9:
    firefox_driver.execute_script("window.scrollTo(0, document.body.scrollHeight/%s);" % scheight)
    scheight += .01        
firefox_driver.save_screenshot('angelfire_firefox.png')

A screenshot is taken, but it does not cover the whole page.

Any ideas on how to make it work with headless Firefox or Chrome?

(P.S. I also found this post:

Taking a full-page screenshot with Selenium Python (Chromedriver)

but it doesn't seem to be a general solution, and it is much more complicated.)

Recommended answer

This is the method I came up with; it takes a complete screenshot of a website of any length. It relies on the fact that a headless browser can be launched with a window of any size. The challenge is getting the scroll height before launching the headless browser, so the page is first loaded once in a regular browser just to measure it. That is the only drawback: the site is loaded twice.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import time

url = 'any website url'

# First run: load the page in a regular browser to measure its scroll height
driver = webdriver.Chrome()
driver.get(url)
# Pause 3 seconds to let the page load
time.sleep(3)
# Take the largest of the common page-height measurements
height = driver.execute_script("return Math.max( document.body.scrollHeight, document.body.offsetHeight, document.documentElement.clientHeight, document.documentElement.scrollHeight, document.documentElement.offsetHeight )")
print(height)
# Quit the browser and end the WebDriver session
driver.quit()

# Second run: open a headless browser whose window is as tall as the page
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument(f"--window-size=1920,{height}")
chrome_options.add_argument("--hide-scrollbars")
driver = webdriver.Chrome(options=chrome_options)

driver.get(url)
# Pause 3 seconds to let the page load
time.sleep(3)
# Save the screenshot
driver.save_screenshot('screen_shot.png')
driver.quit()
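
A possible variation, not part of the answer above: the double load might be avoidable by measuring the height inside the headless session itself and then resizing the window before the screenshot. The sketch below assumes that `set_window_size` is honored for arbitrary heights in headless mode and that the page height does not change after the resize; neither is guaranteed for every page or browser version. The names `HEIGHT_JS`, `screenshot_full_height`, and `main` are hypothetical helpers introduced for illustration.

```python
# JavaScript taken from the answer above: the largest of the common
# page-height measurements.
HEIGHT_JS = (
    "return Math.max("
    "document.body.scrollHeight, document.body.offsetHeight, "
    "document.documentElement.clientHeight, "
    "document.documentElement.scrollHeight, "
    "document.documentElement.offsetHeight)"
)


def screenshot_full_height(driver, path, width=1920):
    """Resize an already-loaded window to the page height, then capture it.

    `driver` is a Selenium WebDriver that has already called get(url).
    Returns the measured height in pixels.
    """
    height = int(driver.execute_script(HEIGHT_JS))
    # Assumption: the browser runs headless, so the window is not limited
    # by the physical screen size.
    driver.set_window_size(width, height)
    driver.save_screenshot(path)
    return height


def main(url):
    """Hypothetical entry point: one headless Chrome session, one page load."""
    import time

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")
    options.add_argument("--hide-scrollbars")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        time.sleep(3)  # crude wait for the page to load, as in the answer
        print(screenshot_full_height(driver, "screen_shot.png"))
    finally:
        driver.quit()
```

Calling `main('any website url')` would run the whole flow. Because `screenshot_full_height` only uses `execute_script`, `set_window_size`, and `save_screenshot`, the same helper should also accept a headless Firefox driver, though that has not been verified here.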
