Python urllib downloading contents of an online directory
Question
I'm trying to write a program that opens an online directory, uses a regular expression to get the names of the PowerPoint files, then creates the files locally and copies their contents. When I run it, it appears to work, but when I actually try to open the downloaded files, they keep saying the version is wrong.
from urllib.request import urlopen
import re

urlpath = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/')
string = urlpath.read().decode('utf-8')

pattern = re.compile('ch[0-9]*.ppt')  # the pattern actually creates duplicates in the list
filelist = pattern.findall(string)
print(filelist)

for filename in filelist:
    remotefile = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/' + filename)
    localfile = open(filename, 'wb')
    localfile.write(remotefile.read())
    localfile.close()
    remotefile.close()
Recommended answer
This code worked for me. I just modified it a little because yours was duplicating each ppt file.
from urllib2 import urlopen  # Python 2 module; on Python 3 use: from urllib.request import urlopen
import re

urlpath = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/')
string = urlpath.read().decode('utf-8')

# The trailing quote limits matches to names that are immediately followed by '"',
# so each file name is matched once instead of twice.
pattern = re.compile('ch[0-9]*.ppt"')
filelist = pattern.findall(string)
print(filelist)

for filename in filelist:
    filename = filename[:-1]  # drop the trailing quote captured by the pattern
    remotefile = urlopen('http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/' + filename)
    localfile = open(filename, 'wb')
    localfile.write(remotefile.read())
    localfile.close()
    remotefile.close()
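For readers on Python 3, the same idea can be sketched with `urllib.request` and context managers. This is only an illustration under stated assumptions, not the answerer's code: the helper names `find_ppt_names` and `download_all` are hypothetical, the sample listing fragment is made up, and duplicates are removed by order-preserving de-duplication rather than by matching the trailing quote. The dot in the pattern is also escaped so that only a literal ".ppt" matches.

```python
import re
import shutil
from urllib.parse import urljoin
from urllib.request import urlopen

# Base URL taken from the question; the helper names below are hypothetical.
BASE_URL = 'http://www.divms.uiowa.edu/~jni/courses/ProgrammignInCobol/presentation/'

# Escaped dot so that only a literal ".ppt" suffix matches.
PPT_PATTERN = re.compile(r'ch[0-9]*\.ppt')


def find_ppt_names(html):
    """Return matching file names in order of first appearance, without duplicates."""
    return list(dict.fromkeys(PPT_PATTERN.findall(html)))


def download_all(base_url=BASE_URL):
    """Fetch the listing page, then save each matching file in the current directory."""
    with urlopen(base_url) as listing:
        html = listing.read().decode('utf-8')
    for name in find_ppt_names(html):
        # Both handles are closed automatically, even if the copy fails midway.
        with urlopen(urljoin(base_url, name)) as remote, open(name, 'wb') as local:
            shutil.copyfileobj(remote, local)  # copy the bytes in chunks


# Offline check of the extraction step on a made-up listing fragment.
sample = '<a href="ch01.ppt">ch01.ppt</a> <a href="ch02.ppt">ch02.ppt</a>'
print(find_ppt_names(sample))  # → ['ch01.ppt', 'ch02.ppt']
```

Calling `download_all()` would perform the actual downloads; it is left uncalled here so the snippet can be run without network access.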