Downloading files from multiple websites

Problem description

This is my first Python project, so it is very basic and rudimentary. I often have to clean viruses off friends' computers, and the free programs that I use are updated often. Instead of manually downloading each program, I was trying to create a simple way to automate the process. Since I am also trying to learn Python, I thought it would be a good opportunity to practice.

The problem:

For some of the links, I have to find the .exe file on the page. I can find the correct URL, but I get an error when the script tries to download it.

Is there a way to add all of the links to a list, then create a function that goes through the list and runs on each URL? I've Googled quite a bit and I just cannot seem to make it work. Maybe I am not thinking in the right direction?
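
In outline, the pattern being asked about looks like this (a minimal sketch in the same Python 2 style as the rest of the code; `download_one` and the example URLs are illustrative, not part of the original attempt):

import os
import urllib

def download_one(url):
    # Save the file under its own name in the current directory
    urllib.urlretrieve(url, os.path.basename(url))

urls = ['http://example.com/a.exe', 'http://example.com/b.exe']
for url in urls:
    download_one(url)

The actual attempt so far: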

import urllib, urllib2, re, os
from BeautifulSoup import BeautifulSoup

# Website List
sas = 'http://cdn.superantispyware.com/SUPERAntiSpyware.exe'
tds = 'http://support.kaspersky.com/downloads/utils/tdsskiller.exe'
mbam = 'http://www.bleepingcomputer.com/download/malwarebytes-anti-malware/dl/7/?1'
tr = 'http://www.simplysup.com/tremover/download.html'
urllist = [sas, tr, tds, tr]
urrllist2 = []

# Find exe files to download

match = re.compile('\.exe')
data = urllib2.urlopen(urllist)
page = BeautifulSoup(data)

# Check links
#def findexe():
for link in page.findAll('a'):
    try:
        href = link['href']
        if re.search(match, href):
            urllist2.append(href)

    except KeyError:
        pass

os.chdir(r"C:\_VirusFixes")
urllib.urlretrieve(urllist2, os.path.basename(urllist2))

As you can see, I have left the function commented out as I cannot get it to work correctly.

Should I abandon the list and just download them individually? I was trying to be efficient.

Any suggestions, or a pointer in the right direction, would be most appreciated.

Recommended answer

In addition to mikez302's answer, here's a slightly more readable way to write your code:

import os
import re
import urllib
import urllib2

from BeautifulSoup import BeautifulSoup

# Note the commas: without them, Python concatenates adjacent string literals
websites = [
    'http://cdn.superantispyware.com/SUPERAntiSpyware.exe',
    'http://support.kaspersky.com/downloads/utils/tdsskiller.exe',
    'http://www.bleepingcomputer.com/download/malwarebytes-anti-malware/dl/7/?1',
    'http://www.simplysup.com/tremover/download.html',
]

download_links = []

for url in websites:
    connection = urllib2.urlopen(url)
    soup = BeautifulSoup(connection)
    connection.close()

    # 'href' must be a quoted string key; match only anchors ending in .exe
    for link in soup.findAll('a', {'href': re.compile(r'\.exe$')}):
        download_links.append(link['href'])

for url in download_links:
    # urlretrieve expects a full destination path, not a directory plus a separate filename
    urllib.urlretrieve(url, os.path.join(r'C:\_VirusFixes', os.path.basename(url)))
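
Note that urllib, urllib2, and BeautifulSoup 3 are Python 2 era libraries. For reference, a rough Python 3 equivalent of the same logic would look like the following sketch, assuming the beautifulsoup4 package is installed (this is not part of the original answer, and the list is shortened here):

import os
import re
import urllib.request

from bs4 import BeautifulSoup  # the beautifulsoup4 package

websites = [
    'http://www.bleepingcomputer.com/download/malwarebytes-anti-malware/dl/7/?1',
    'http://www.simplysup.com/tremover/download.html',
]

download_links = []

for url in websites:
    # Fetch and parse each page; html.parser avoids an extra dependency
    with urllib.request.urlopen(url) as connection:
        soup = BeautifulSoup(connection.read(), 'html.parser')

    for link in soup.find_all('a', href=re.compile(r'\.exe$')):
        download_links.append(link['href'])

for url in download_links:
    # urlretrieve still exists in Python 3 under urllib.request
    urllib.request.urlretrieve(url, os.path.join(r'C:\_VirusFixes', os.path.basename(url)))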
