How can I download something with Selenium and Chrome?

Views: 359 | Posted: 2018/5/8 18:41:02 | Tags: python google-chrome selenium download


Problem description

As a first step, I tried to set the default download folder.

I tried six options (numbered Fail (1) to Fail (6) in the code), but none of them worked:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""Selenium example for downloading a webpage."""

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains
import os
import time


def main():
    """Download an opened PDF page."""
    browser = get_browser()
    url = "https://martin-thoma.com/pdf/cv-curriculum-vitae.pdf"
    browser.get(url)  # Open a PDF page
    # el = browser.find_element_by_id("plugin")
    time.sleep(5)
    ActionChains(browser).send_keys(Keys.CONTROL, "s").perform()
    print(browser.current_url)
    time.sleep(60)  # Keep the browser open for 60s


def get_browser():
    """Get the browser (a "driver")."""
    # find the path with 'which chromedriver'
    path_to_chromedriver = ('/home/moose/GitHub/algorithms/scraping/'
                            'venv/bin/chromedriver')
    download_dir = "/home/moose/selenium-download/"
    print("Is directory: {}".format(os.path.isdir(download_dir)))

    fail = 6
    options = None
    desired_caps = None
    if fail == 1:
        # Fail (1)
        os.environ['XDG_DOWNLOAD_DIR'] = download_dir
    elif fail == 2:
        # Fail (2)
        options = webdriver.ChromeOptions()
        options.add_argument("download.default_directory={}"
                             .format(download_dir))
    elif fail == 3:
        # Fail (3)
        options = webdriver.ChromeOptions()
        prefs = {"download.default_directory": download_dir}
        options.add_experimental_option("prefs", prefs)
    elif fail == 4:
        # Fail (4)
        desired_caps = {'prefs':
                        {'download': {'default_directory': download_dir,
                                      'directory_upgrade': "true",
                                      'extensions_to_open': ""}}}
    elif fail == 5:
        # Fail (5)
        desired_caps = {'prefs':
                        {'download.default_directory': download_dir}}
    elif fail == 6:
        # Fail (6)
        desired_caps = {'prefs':
                        {'download': {'default_directory': download_dir,
                                      'directory_upgrade': True,
                                      'extensions_to_open': ""}}}

    browser = webdriver.Chrome(executable_path=path_to_chromedriver,
                               chrome_options=options,
                               desired_capabilities=desired_caps)
    return browser


if __name__ == '__main__':
    main()

I know there are simpler ways to download a PDF by URL. However, my real use case is much more complicated: the download is triggered by a JavaScript-generated click on a link behind a 3-step login process, which is done purely with JavaScript.

So this question has two aspects:

How do I change the default download directory with Selenium and Chrome (on Ubuntu 16.04)?

How do I download an opened PDF? (I tried an action chain, but it doesn't work)

I have Google Chrome Version 59.0.3071.115 (Official Build) (64-bit), downloaded via the pip installer.
Solution
First, you need this import:

from selenium.webdriver.chrome.options import Options

And change the whole if block and the browser initialization in get_browser() to this:

chrome_options = Options()
chrome_options.add_experimental_option('prefs', {
    "plugins.plugins_list": [{"enabled": False, "name": "Chrome PDF Viewer"}],
    "download": {
        "prompt_for_download": False,
        "default_directory": download_dir
    }
})

browser = webdriver.Chrome(path_to_chromedriver, chrome_options=chrome_options)

(I use Windows but there shouldn't be any differences.)
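
For readers on a newer setup, here is a minimal sketch of the same idea. It is not part of the original answer. The helper names `build_download_prefs`, `wait_for_download` and `make_browser` are hypothetical. It assumes a recent Selenium release in which `webdriver.Chrome` accepts `options=`, and it assumes two Chrome behaviours: that the `plugins.always_open_pdf_externally` preference makes Chrome save PDFs instead of showing them in the built-in viewer, and that unfinished downloads carry a `.crdownload` suffix.

```python
import os
import time


def build_download_prefs(download_dir):
    """Return Chrome prefs that save downloads to download_dir without a prompt."""
    return {
        "download": {
            "prompt_for_download": False,
            "default_directory": os.path.abspath(download_dir),
        },
        # Assumption: this preference tells Chrome to download PDFs
        # rather than open them in the built-in viewer.
        "plugins.always_open_pdf_externally": True,
    }


def wait_for_download(download_dir, timeout=30.0, poll=0.5):
    """Block until download_dir holds at least one file and no partial downloads.

    Assumption: Chrome names unfinished downloads with a '.crdownload' suffix.
    Returns the sorted file names, or raises TimeoutError.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        names = sorted(os.listdir(download_dir))
        if names and not any(name.endswith(".crdownload") for name in names):
            return names
        time.sleep(poll)
    raise TimeoutError("no finished download in {}".format(download_dir))


def make_browser(download_dir):
    """Start Chrome with the download prefs applied (needs Selenium and Chrome)."""
    # Imported here so the two helpers above work without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", build_download_prefs(download_dir))
    return webdriver.Chrome(options=options)
```

A possible usage: call `browser = make_browser(download_dir)`, trigger the download with `browser.get(url)` or a click, then call `wait_for_download(download_dir)` in place of a fixed `time.sleep(60)`. The directory check is only a heuristic and expects the download folder to be empty beforehand.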
