Take screenshot of full page with Selenium Python with chromedriver

Problem description

After trying out various approaches, I stumbled upon this page, which takes a full-page screenshot with chromedriver, Selenium and Python.

The original code is here (I have also copied it into this post below).

It uses PIL and it works great. However, there is one issue: it captures fixed headers and repeats them down the whole page, and it also misses some parts of the page at the transitions between captures. Sample URL to take a screenshot of:

http://www.w3schools.com/js/default.asp

How can I avoid the repeated headers with this code? Or is there a better option that uses Python only? (I don't know Java and do not want to use Java.)

Please see the screenshot of the current result and the sample code below.

test.py

"""
This script uses a simplified version of the one here:
https://snipt.net/restrada/python-selenium-workaround-for-full-page-screenshot-using-chromedriver-2x/

It contains the *crucial* correction added in the comments by Jason Coutu.
"""

import sys

from selenium import webdriver
import unittest

import util

class Test(unittest.TestCase):
    """ Demonstration: Get Chrome to generate fullscreen screenshot """

    def setUp(self):
        self.driver = webdriver.Chrome()

    def tearDown(self):
        self.driver.quit()

    def test_fullpage_screenshot(self):
        ''' Generate document-height screenshot '''
        #url = "http://effbot.org/imagingbook/introduction.htm"
        url = "http://www.w3schools.com/js/default.asp"
        self.driver.get(url)
        util.fullpage_screenshot(self.driver, "test.png")


if __name__ == "__main__":
    unittest.main(argv=[sys.argv[0]])

util.py

import os
import time

from PIL import Image

def fullpage_screenshot(driver, file):

        print("Starting chrome full page screenshot workaround ...")

        total_width = driver.execute_script("return document.body.offsetWidth")
        total_height = driver.execute_script("return document.body.parentNode.scrollHeight")
        viewport_width = driver.execute_script("return document.body.clientWidth")
        viewport_height = driver.execute_script("return window.innerHeight")
        print("Total: ({0}, {1}), Viewport: ({2},{3})".format(total_width, total_height,viewport_width,viewport_height))
        rectangles = []

        i = 0
        while i < total_height:
            ii = 0
            top_height = i + viewport_height

            if top_height > total_height:
                top_height = total_height

            while ii < total_width:
                top_width = ii + viewport_width

                if top_width > total_width:
                    top_width = total_width

                print("Appending rectangle ({0},{1},{2},{3})".format(ii, i, top_width, top_height))
                rectangles.append((ii, i, top_width,top_height))

                ii = ii + viewport_width

            i = i + viewport_height

        stitched_image = Image.new('RGB', (total_width, total_height))
        previous = None
        part = 0

        for rectangle in rectangles:
            if previous is not None:
                driver.execute_script("window.scrollTo({0}, {1})".format(rectangle[0], rectangle[1]))
                print("Scrolled To ({0},{1})".format(rectangle[0], rectangle[1]))
                time.sleep(0.2)

            file_name = "part_{0}.png".format(part)
            print("Capturing {0} ...".format(file_name))

            driver.get_screenshot_as_file(file_name)
            screenshot = Image.open(file_name)

            if rectangle[1] + viewport_height > total_height:
                offset = (rectangle[0], total_height - viewport_height)
            else:
                offset = (rectangle[0], rectangle[1])

            print("Adding to stitched image with offset ({0}, {1})".format(offset[0],offset[1]))
            stitched_image.paste(screenshot, offset)

            del screenshot
            os.remove(file_name)
            part = part + 1
            previous = rectangle

        stitched_image.save(file)
        print("Finishing chrome full page screenshot workaround...")
        return True
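
The repeated header appears because an element with a fixed position is redrawn in every viewport capture. One possible workaround, sketched here as an assumption and not part of the original script, is to hide such elements once the first tile has been captured. The names `HIDE_FIXED_JS` and `hide_fixed_elements` are hypothetical; only `driver.execute_script` is the Selenium call already used above.

```python
# JavaScript that hides every element whose computed position is
# "fixed" or "sticky". Scanning all elements is simple but can be slow
# on very large pages.
HIDE_FIXED_JS = """
for (const el of document.querySelectorAll('*')) {
    const pos = window.getComputedStyle(el).position;
    if (pos === 'fixed' || pos === 'sticky') {
        el.style.setProperty('visibility', 'hidden', 'important');
    }
}
"""


def hide_fixed_elements(driver):
    """Hide fixed/sticky elements so later tiles do not repeat them."""
    driver.execute_script(HIDE_FIXED_JS)
```

In `fullpage_screenshot` this could be called right after the first tile is saved (for example when `part == 0`), so the header still shows at the top of the stitched image but not in the tiles below it. Whether this removes every repeated element depends on how the page is styled.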

Solution

New answer (updated 21 Nov 2021)

Install Playwright (https://playwright.dev/python/docs/intro#installation).

With Playwright this job is now very easy. Taking a screenshot from the command line is documented here: https://playwright.dev/python/docs/cli#take-screenshot

Run playwright.exe from PowerShell. (Of course, you can also use Playwright from within a Python script.) This is the easiest solution:


PS C:\JupyterLab\resources\jlab_server\Scripts> .\playwright.exe screenshot --full-page https://www.w3schools.com/js/default.asp test.png
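
For the Python-script route mentioned above, a minimal sketch using Playwright's synchronous API might look like the following. The helper name `save_full_page` and the `main` wrapper are hypothetical; the sketch assumes Playwright and its Chromium browser are already installed.

```python
def save_full_page(page, url, path):
    """Navigate `page` to `url` and write a full-page PNG to `path`."""
    page.goto(url)
    # full_page=True captures the whole scrollable page, not just the viewport.
    page.screenshot(path=path, full_page=True)
    return path


def main():
    # Imported here so the helper above can be reused without Playwright loaded.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        save_full_page(page, "https://www.w3schools.com/js/default.asp", "test.png")
        browser.close()
```

Calling `main()` should produce the same `test.png` as the command line above.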

Earlier answer

How it works: make the browser window at least as tall as the page content, then take an ordinary screenshot.

# coding=utf-8
import time
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

def test_fullpage_screenshot():
    chrome_options = Options()
    # In headless mode the window is not limited to the physical screen size.
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--start-maximized')
    # Selenium 4 expects `options=`; the older `chrome_options=` keyword is deprecated.
    driver = webdriver.Chrome(options=chrome_options)
    driver.get("yoururlxxx")
    time.sleep(2)

    # The element with the greatest height on the page (site-specific XPath).
    ele = driver.find_element("xpath", '//div[@class="react-grid-layout layout"]')
    total_height = ele.size["height"] + 1000

    driver.set_window_size(1920, total_height)  # the trick
    time.sleep(2)
    driver.save_screenshot("screenshot1.png")
    driver.quit()

if __name__ == "__main__":
    test_fullpage_screenshot()
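
The XPath above is specific to one site. A more generic variant, offered only as a sketch, reads the document height through JavaScript instead of measuring a known element. The names `PAGE_HEIGHT_JS` and `resize_to_page_height` are hypothetical, and the approach assumes the page scrolls at the document level; pages that scroll inside an inner container would still need a site-specific measurement.

```python
# JavaScript returning the larger of the body and root-element scroll heights.
PAGE_HEIGHT_JS = (
    "return Math.max("
    "document.body.scrollHeight, "
    "document.documentElement.scrollHeight)"
)


def resize_to_page_height(driver, width=1920, padding=0):
    """Resize the window so the whole document fits; return the height used."""
    height = int(driver.execute_script(PAGE_HEIGHT_JS)) + padding
    driver.set_window_size(width, height)
    return height
```

It would replace the `find_element` and `set_window_size` lines above, followed by the same `time.sleep` and `save_screenshot` calls. As with the original trick, this generally relies on headless mode.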
