How to download a file using Selenium?

Question

I am trying to get the download link and download the files.

I have a log file which contains the following links:

http://www.downloadcrew.com/article/18631-aida64
http://www.downloadcrew.com/article/4475-sumo
http://www.downloadcrew.com/article/2174-iolo_system_mechanic_professional
...
...

My code looks like this:

import time
import urllib.request

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By

# The original snippet used `driver` without creating it; a plain Firefox
# session is assumed here.
driver = webdriver.Firefox()

with open("dcrewtest.txt") as f:
    for line in f:
        url = line.strip()
        if not url:
            continue

        # Try to click the download button; skip pages where it is missing.
        try:
            driver.get(url)
            driver.find_element(By.XPATH, "//div/div[2]/div[2]/div[2]/div[3]/div/a/img").click()
            time.sleep(8)
        except Exception:
            pass

        # Scrape the article title and metadata with BeautifulSoup.
        page = urllib.request.urlopen(url).read()
        soup = BeautifulSoup(page, "html.parser")
        for a in soup.select("h1#articleTitle"):
            print(a.contents[0].strip())

        for b in soup.find_all("th"):
            if b.text == "Date Updated:":
                print(b.parent.td.text)
            elif b.text == "Developer:":
                print(b.parent.td.text)

From here on I do not know how to get the download link and download the file. Is it possible to download the file using Selenium?

Recommended answer

According to the documentation (https://selenium-python.readthedocs.org/en/latest/faq.html?highlight=profile#how-to-auto-save-files-using-custom-firefox-profile), you should configure a FirefoxProfile to automatically download files with a specified content type. Here's an example that uses the first URL from your txt file and saves the exe file in the current directory:

import os
from selenium import webdriver


fp = webdriver.FirefoxProfile()

# 2 = save downloads to the directory given in browser.download.dir
fp.set_preference("browser.download.folderList", 2)
# Do not open the download manager window when a download starts
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir", os.getcwd())
# Save files of this content type without showing the save dialog
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/x-msdos-program")

driver = webdriver.Firefox(firefox_profile=fp)
driver.get("http://www.downloadcrew.com/article/18631-aida64")

# Click the download button
driver.find_element_by_xpath("//div[@class='downloadLink']/a/img").click()

Note that I've also simplified the XPath.
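
Newer Selenium 4 releases no longer accept the `firefox_profile` argument or the `find_element_by_*` helpers, so the same idea may need a slightly different spelling there. The following is only a sketch under that assumption: it sets the same four preferences through `FirefoxOptions`, and adds a polling helper as an alternative to a fixed `time.sleep`. The names `download_preferences`, `wait_for_download`, `make_driver` and `run_example` are made up for illustration, and the `.part` check relies on Firefox writing in-progress downloads to a temporary `.part` file, which can race with the moment the browser creates its files.

```python
import os
import time


def download_preferences(download_dir, mime_types):
    """Return the Firefox download preferences from the answer as a dict."""
    return {
        # 2 = use the custom directory given in browser.download.dir
        "browser.download.folderList": 2,
        "browser.download.manager.showWhenStarting": False,
        "browser.download.dir": download_dir,
        # Firefox accepts a comma-separated list of MIME types here.
        "browser.helperApps.neverAsk.saveToDisk": ",".join(mime_types),
    }


def wait_for_download(download_dir, before, timeout=60.0, poll=0.5):
    """Return the name of a new, finished file, or None on timeout.

    `before` is the set of file names present before the click.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        names = set(os.listdir(download_dir))
        finished = sorted(n for n in names - before if not n.endswith(".part"))
        still_writing = any(n.endswith(".part") for n in names)
        if finished and not still_writing:
            return finished[0]
        time.sleep(poll)
    return None


def make_driver(download_dir, mime_types):
    """Hypothetical helper: start Firefox with the preferences applied."""
    # Imported here so the two helpers above work without Selenium installed.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    for name, value in download_preferences(download_dir, mime_types).items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)


def run_example():
    """Download the file behind the first URL into the current directory."""
    from selenium.webdriver.common.by import By

    target = os.getcwd()
    before = set(os.listdir(target))
    driver = make_driver(target, ["application/x-msdos-program"])
    try:
        driver.get("http://www.downloadcrew.com/article/18631-aida64")
        driver.find_element(By.XPATH, "//div[@class='downloadLink']/a/img").click()
        return wait_for_download(target, before)
    finally:
        driver.quit()
```

Calling `run_example()` needs Firefox and geckodriver to be available. If the server reports a different content type for the file, Firefox will still show the save dialog, so the MIME type list may have to be extended.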
