Naming a file when downloading with Selenium Webdriver


Question

I see that you can set where Webdriver downloads files to, as follows:

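from os import getcwd

from selenium import webdriver

fp = webdriver.FirefoxProfile()

fp.set_preference("browser.download.folderList", 2)  # 2 = use the custom directory set below
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir", getcwd())
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "text/csv")  # save CSV without prompting

# Selenium 3 style; Selenium 4 deprecates the firefox_profile argument in favour of Options
browser = webdriver.Firefox(firefox_profile=fp)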

However, is there a similar way to give the file a name when it is downloaded? Preferably it would not be tied to the profile, because I will be downloading ~6000 files through one browser instance and do not want to reinitialize the driver for each download.

Solution with code, as suggested by the selected answer: rename each file after it is downloaded.

import os

# SAVE_TO_DIRECTORY and docName are defined elsewhere in the script
files = [os.path.join(SAVE_TO_DIRECTORY, f) for f in os.listdir(SAVE_TO_DIRECTORY)]  # add path to each entry
files = [f for f in files if os.path.isfile(f)]  # keep regular files only
files.sort(key=os.path.getmtime)  # oldest first, newest last
newest_file = files[-1]
os.rename(newest_file, os.path.join(SAVE_TO_DIRECTORY, docName + ".pdf"))

Recommended answer

I do not know whether there is a pure Selenium handler for this, but here is what I have done when I needed to do something with the downloaded file.

  1. Set up a loop that polls your download directory for the latest file that does not have a .part extension (this indicates a partial download and would occasionally trip things up if not accounted for). Put a timer on this to ensure that you don't go into an infinite loop if a timeout or other error keeps the download from completing. I used the output of the ls -t <dirname> command in Linux (my old code uses commands, which is deprecated, so I won't show it here :) ) and got the first file by using

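# result = output of ls -t
# The [1] and split(' ') indexing appears to assume a long listing (ls -lt),
# where line 0 is "total N" and the file name is the last space-separated field.
result = result.split('\n')[1].split(' ')[-1]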

  2. If the while loop exits successfully, the topmost file in the directory will be your file, which you can then modify using os.rename (or anything else you like).

    Probably not the answer you were looking for, but hopefully it points you in the right direction.
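The two steps above can be sketched in pure Python, without shelling out to ls. This is only a minimal illustration, not the answerer's original code: the helper names wait_for_download and rename_download, the default timeout, and the polling interval are all assumptions made up for this example. It also assumes the function is called after the download has started; if older files may already sit in the directory, you would probably want to compare against a listing taken before the click.

```python
import os
import time


def newest_first(directory):
    """List regular files in `directory`, most recently modified first."""
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    return sorted(
        (p for p in paths if os.path.isfile(p)),
        key=os.path.getmtime,
        reverse=True,
    )


def wait_for_download(directory, timeout=30.0, poll_interval=0.5):
    """Return the newest file once it no longer has a .part extension.

    Hypothetical helper: raises TimeoutError after `timeout` seconds so a
    stalled download cannot cause an infinite loop.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            files = newest_first(directory)
        except FileNotFoundError:
            files = []  # a file vanished mid-scan (e.g. .part was renamed); retry
        # A .part file on top means the browser is still writing the download.
        if files and not files[0].endswith(".part"):
            return files[0]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no finished download in {directory!r}")
        time.sleep(poll_interval)


def rename_download(directory, new_name, timeout=30.0):
    """Wait for the newest download, then rename it within `directory`."""
    newest = wait_for_download(directory, timeout=timeout)
    target = os.path.join(directory, new_name)
    os.rename(newest, target)
    return target
```

With something like this, each of the ~6000 downloads could be followed by a call such as rename_download(download_dir, docName + ".pdf"), reusing the same browser instance throughout.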
