Using Selenium with Python and PhantomJS to download a file to the filesystem


Question

I've been grappling with using PhantomJS/Selenium/python-selenium to download a file to the filesystem. I can easily navigate the DOM and click, hover, etc. Downloading a file, however, is proving quite troublesome. I've tried a headless approach with Firefox and pyvirtualdisplay, but that didn't work well either and was unbelievably slow. I know that CasperJS allows file downloads. Does anyone know how to integrate CasperJS with Python, or how to use PhantomJS to download files? Much appreciated.

Recommended answer

Although this question is quite old, downloading files through PhantomJS is still a problem. We can, however, use PhantomJS to find the download link and collect all the cookies we need, such as CSRF tokens. Then we can use requests to perform the actual download:

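import requests
from selenium import webdriver

driver = webdriver.PhantomJS()
driver.get('page_with_download_link')
# find_element_by_id returns a WebElement, not a URL; read the URL from its
# href attribute (this assumes the element is an <a> tag).
download_link = driver.find_element_by_id('download_link').get_attribute('href')

# Copy the browser's cookies (session id, CSRF token, ...) into a requests session.
session = requests.Session()
for cookie in driver.get_cookies():
    session.cookies.set(cookie['name'], cookie['value'])

# Fetch the file with the same cookies the browser had.
response = session.get(download_link)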

response.content should now hold the actual file content. We can then write it to disk with open, or do whatever else we want with it.
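
A minimal sketch of that last step, writing the downloaded bytes to disk. The helper name save_download and the target path are illustrative assumptions, not part of the original answer; only the standard library is used.

```python
from pathlib import Path


def save_download(content: bytes, target: str) -> Path:
    """Write raw bytes (e.g. response.content) to `target` and return its path.

    `save_download` is a hypothetical helper name chosen for this sketch.
    """
    path = Path(target)
    # Create missing parent directories so writing into a new folder works.
    path.parent.mkdir(parents=True, exist_ok=True)
    # Binary mode ("wb") keeps the payload byte-for-byte identical.
    with open(path, "wb") as fh:
        fh.write(content)
    return path


# Usage with the response from the answer above (the file name is a placeholder):
# save_download(response.content, "downloads/report.pdf")
```

Opening the file in binary mode matters because response.content is bytes; text mode would either fail or corrupt non-text files such as PDFs or archives.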
