Selenium (Python) - waiting for a download process to complete using Chrome web driver

Question

I'm using Selenium with Python via ChromeDriver (Windows) to automate downloading a large number of files from different pages. My code works, but the solution is far from ideal: the function below clicks a button on the website that triggers a JavaScript function, which generates a PDF file and then downloads it.

I had to use a static wait for the download to complete, which is ugly. I cannot check the file system to verify when the download has finished, because I'm using multithreading (downloading lots of files from different pages at once) and the file names are generated dynamically by the website itself.

My code:

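import time

from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

def file_download(num, drivervar):
    global Counter  # assumed to be a module-level counter defined elsewhere, like url
    Counter += 1
    try:
        drivervar.get(url[num])
        download_button = WebDriverWait(drivervar, 20).until(EC.element_to_be_clickable((By.ID, 'download button ID')))
        download_button.click()
        time.sleep(10)  # static wait for the download to finish
    except TimeoutException: # Retry once
        print('Timeout in thread number: ' + str(num) + ', retrying...')
.....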

Is it possible to determine download completion in webdriver? I want to avoid using time.sleep(x).

Thanks a lot.

Recommended answer

You can get the status of each download by visiting chrome://downloads/ with the driver.

To wait for all the downloads to finish and to list all the paths:

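from selenium.webdriver.support.ui import WebDriverWait

def every_downloads_chrome(driver):
    # open Chrome's downloads page unless the driver is already on it
    if not driver.current_url.startswith("chrome://downloads"):
        driver.get("chrome://downloads/")
    # returns the file URLs once every item is COMPLETE; otherwise the script
    # returns undefined (None in Python), so the wait polls again
    return driver.execute_script("""
        var items = document.querySelector('downloads-manager')
            .shadowRoot.getElementById('downloadsList').items;
        if (items.every(e => e.state === "COMPLETE"))
            return items.map(e => e.fileUrl || e.file_url);
        """)


# waits up to 120 seconds, polling every second, for all the files to be completed and returns the paths
paths = WebDriverWait(driver, 120, 1).until(every_downloads_chrome)
print(paths)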
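
The wait works because `until` keeps calling the function until it returns something truthy. While a download is still in progress the script returns nothing, which arrives in Python as `None`, and an empty downloads list comes back as `[]`, which is also falsy. The sketch below is a simplified model of that polling loop, not Selenium's actual implementation: `wait_until` and `make_fake_downloads` are hypothetical names, the condition takes no `driver` argument, and it raises the built-in `TimeoutError` where Selenium raises `TimeoutException`.

```python
import time


def wait_until(condition, timeout=120.0, poll_interval=1.0):
    """Simplified model of WebDriverWait(driver, timeout, poll_interval).until()."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:  # None and [] are both falsy, so polling continues
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %s seconds" % timeout)
        time.sleep(poll_interval)


def make_fake_downloads(states_per_poll):
    """Hypothetical stand-in for every_downloads_chrome.

    Each poll consumes one snapshot: a list of (path, state) pairs.
    """
    polls = iter(states_per_poll)

    def condition():
        items = next(polls)
        # Mirrors the JavaScript: return the paths only if every item is COMPLETE
        if all(state == "COMPLETE" for _, state in items):
            return [path for path, _ in items]
        return None

    return condition


if __name__ == "__main__":
    fake = make_fake_downloads([
        [("a.pdf", "IN_PROGRESS")],
        [("a.pdf", "COMPLETE")],
    ])
    print(wait_until(fake, timeout=5, poll_interval=0.01))
```

One consequence of this behavior: if nothing has been downloaded yet, the list is empty and the wait keeps polling until the timeout, so the wait should only start after a download has been triggered.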

The snippet was updated to support changes up to version 81.
