Downloading files from multiple websites
Question
This is my first Python project, so it is very basic and rudimentary. I often have to clean viruses off friends' computers, and the free programs that I use are updated often. Instead of manually downloading each program, I was trying to create a simple way to automate the process. Since I am also trying to learn Python, I thought it would be a good opportunity to practice.
The problem:
For some of the sites I have to find the .exe file behind a link. I can find the correct URL, but I get an error when it tries to download.
Is there a way to add all of the links to a list, then create a function that goes through the list and runs on each URL? I've Googled quite a bit and I just cannot seem to make it work. Maybe I am not thinking in the right direction?
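The pattern being asked about (put the URLs in a list, define one function, run it on every entry) can be sketched like this. The URLs and the `target_filename` helper below are made up for illustration; the actual download call would go where the helper is used:

```python
import os

# Hypothetical list of direct .exe URLs (stand-ins for illustration)
urls = [
    'http://example.com/tools/cleaner.exe',
    'http://example.com/tools/scanner.exe',
]

def target_filename(url):
    # Pick the local filename to save a download under,
    # e.g. 'cleaner.exe' from the full URL.
    return os.path.basename(url)

# Run the same function on every URL in the list
names = [target_filename(u) for u in urls]
print(names)  # ['cleaner.exe', 'scanner.exe']
```

In a real script the loop body would call a download function (such as `urlretrieve`) instead of just computing the filename, but the list-plus-loop structure is the same.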
import urllib, urllib2, re, os
from BeautifulSoup import BeautifulSoup

# Website List
sas = 'http://cdn.superantispyware.com/SUPERAntiSpyware.exe'
tds = 'http://support.kaspersky.com/downloads/utils/tdsskiller.exe'
mbam = 'http://www.bleepingcomputer.com/download/malwarebytes-anti-malware/dl/7/?1'
tr = 'http://www.simplysup.com/tremover/download.html'
urllist = [sas, tr, tds, tr]
urrllist2 = []

# Find exe files to download
match = re.compile('\.exe')
data = urllib2.urlopen(urllist)
page = BeautifulSoup(data)

# Check links
#def findexe():
for link in page.findAll('a'):
    try:
        href = link['href']
        if re.search(match, href):
            urllist2.append(href)
    except KeyError:
        pass

os.chdir(r"C:\_VirusFixes")
urllib.urlretrieve(urllist2, os.path.basename(urllist2))
As you can see, I have left the function commented out because I cannot get it to work correctly.
Should I abandon the list and just download them individually? I was trying to be efficient.
Any suggestions, or a pointer in the right direction, would be much appreciated.
Answer
In addition to mikez302's answer, here's a slightly more readable way to write your code:
import os
import re
import urllib
import urllib2
from BeautifulSoup import BeautifulSoup

websites = [
    'http://cdn.superantispyware.com/SUPERAntiSpyware.exe',
    'http://support.kaspersky.com/downloads/utils/tdsskiller.exe',
    'http://www.bleepingcomputer.com/download/malwarebytes-anti-malware/dl/7/?1',
    'http://www.simplysup.com/tremover/download.html',
]

download_links = []

for url in websites:
    connection = urllib2.urlopen(url)
    soup = BeautifulSoup(connection)
    connection.close()
    # Keep only anchors whose href ends in .exe
    for link in soup.findAll('a', href=re.compile(r'\.exe$')):
        download_links.append(link['href'])

for url in download_links:
    # urlretrieve expects a full file path, not a directory
    urllib.urlretrieve(url, os.path.join(r'C:\_VirusFixes', os.path.basename(url)))
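The key filtering step above only keeps links whose href ends in .exe. That step can be tried in isolation, without any network access, on some made-up hrefs:

```python
import re

exe_pattern = re.compile(r'\.exe$')

# Sample hrefs such as a download page might contain (made up for illustration)
hrefs = [
    '/downloads/tdsskiller.exe',
    '/downloads/readme.html',
    '/files/TRemover.exe',
]

# Keep only hrefs that end in .exe, mirroring the soup.findAll filter above
matches = [h for h in hrefs if exe_pattern.search(h)]
print(matches)  # ['/downloads/tdsskiller.exe', '/files/TRemover.exe']
```

Anchoring the pattern with `$` avoids matching URLs that merely contain ".exe" somewhere in the middle, which the original unanchored `'\.exe'` pattern would also have matched.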