How to read a file downloaded by selenium webdriver in python

Views: 411 Published: 2020/7/27 20:57:29 Tags: python selenium selenium-webdriver web-scraping webdriver


Problem description

I am using Selenium WebDriver in Python to download a CSV file from a site. The file is saved to the download directory I specify. Here is an overview of my code:

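from selenium import webdriver

# Configure a Firefox profile that saves downloads to a fixed folder without prompting.
fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2)  # 2 = use the custom directory set below
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir", 'xx/yy')
# MIME types that are saved to disk without showing a dialog
fp.set_preference('browser.helperApps.neverAsk.saveToDisk', "text/plain, application/vnd.ms-excel, text/csv, text/comma-separated-values, application/octet-stream")
# Selenium 3 style call; newer Selenium releases take the profile through an Options object
driver = webdriver.Firefox(fp)
driver.get('url')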

I need to print the contents of this CSV to the terminal. Many similar files with random names will be downloaded into the same folder, so accessing the file by name won't work, because I don't know in advance what the name will be.

Recommended answer

This answer was put together from earlier Stack Overflow questions and answers, as well as comments on this post, so thank you, everyone.

For this solution I combined Selenium WebDriver with the Python requests module. Essentially, I logged in to the site using Selenium, copied the cookies from the webdriver session, and then called requests.get(url, cookies=webdriver_cookies) to fetch the file directly.

Here is the gist of my solution:

import requests
from selenium import webdriver

fp = webdriver.FirefoxProfile()
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir", 'xx/yy')
fp.set_preference('browser.helperApps.neverAsk.saveToDisk', "text/plain, application/vnd.ms-excel, text/csv, text/comma-separated-values, application/octet-stream")
# Selenium 3 style call; newer Selenium releases take the profile through an Options object
driver = webdriver.Firefox(fp)

# selenium login code ...

# Copy the cookies from the logged-in browser session into a plain dict.
driver_cookies = driver.get_cookies()
cookies_copy = {}
for driver_cookie in driver_cookies:
    cookies_copy[driver_cookie["name"]] = driver_cookie["value"]
# Fetch the file with the same session cookies and print its contents.
r = requests.get('url', cookies=cookies_copy)
print(r.text)
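
As a small follow-up to the cookie-copying step above, here is a minimal sketch of the same conversion as a reusable function, plus one way to turn the downloaded text into rows instead of printing it raw. The helper names cookies_to_dict and parse_csv_text and the sample data are hypothetical and only stand in for driver.get_cookies() and r.text, so the snippet runs without a browser or a network connection.

```python
import csv
import io


def cookies_to_dict(driver_cookies):
    """Turn the list of cookie dicts returned by driver.get_cookies()
    into the name -> value mapping that requests accepts."""
    return {cookie["name"]: cookie["value"] for cookie in driver_cookies}


def parse_csv_text(text):
    """Parse CSV text (for example r.text) into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline="")))


# Hypothetical sample data standing in for driver.get_cookies() and r.text.
sample_cookies = [
    {"name": "sessionid", "value": "abc123", "domain": "example.com"},
    {"name": "csrftoken", "value": "xyz789", "domain": "example.com"},
]
sample_text = "id,name\n1,alice\n2,bob\n"

cookies_copy = cookies_to_dict(sample_cookies)
rows = parse_csv_text(sample_text)

# Print each parsed row to the terminal.
for row in rows:
    print(", ".join(row))
```

In the real script, cookies_to_dict(driver.get_cookies()) would replace the explicit loop, and parse_csv_text(r.text) would give rows that can be printed or processed further.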

I hope this helps someone.
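
If the file has to be read from the download folder after all, one possible alternative (not part of the answer above) is to pick the most recently modified CSV in that folder. The sketch below assumes Firefox's behaviour of keeping a temporary *.part file next to an unfinished download. The function name newest_download, its parameters, and the demo file names are hypothetical.

```python
import os
import tempfile
import time
from pathlib import Path


def newest_download(download_dir, pattern="*.csv", timeout=30.0, poll=0.5):
    """Return the most recently modified file matching pattern,
    or None if nothing suitable shows up before the timeout.

    Keeps waiting while any *.part file (an unfinished Firefox
    download) is still present in the folder.
    """
    folder = Path(download_dir)
    deadline = time.monotonic() + timeout
    while True:
        in_progress = any(folder.glob("*.part"))
        candidates = list(folder.glob(pattern))
        if candidates and not in_progress:
            return max(candidates, key=lambda p: p.stat().st_mtime)
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


# Demo on a temporary folder with two fake downloads (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    old = Path(tmp, "a1b2.csv")
    new = Path(tmp, "c3d4.csv")
    old.write_text("id,name\n1,old\n")
    new.write_text("id,name\n2,new\n")
    # Set explicit modification times so the ordering is deterministic.
    os.utime(old, (1_000_000_000, 1_000_000_000))
    os.utime(new, (1_000_000_100, 1_000_000_100))

    latest = newest_download(tmp, timeout=1.0)
    latest_name = latest.name
    latest_text = latest.read_text()
    print(latest_text)
```

Because older files may already be in the folder, it is probably safer to record the folder contents before triggering the download and only consider files that appear afterwards.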
