Python and urllib
I'm trying to download a zip file ("tl_2008_01001_edges.zip") from an FTP census site using urllib. What form is the zip file in when I get it, and how do I save it?
I'm fairly new to Python and don't understand how urllib works.
This is my attempt:
import urllib, sys
zip_file = urllib.urlretrieve("ftp://ftp2.census.gov/geo/tiger/TIGER2008/01_ALABAMA/Autauga_County/", "tl_2008_01001_edges.zip")
If I know the list of ftp folders (or counties in this case), can I run through the ftp site list using the glob function?
Thanks.
Use urllib2.urlopen() for the zip file data and directory listing.
To process zip files with the zipfile module, you can write them to a disk file which is then passed to the zipfile.ZipFile constructor.
Retrieving the data is straightforward using read() on the file-like object returned by urllib2.urlopen().
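As a rough sketch of the steps just described (read the bytes, write them to disk, open the result with zipfile.ZipFile), something like the following could work. It is written for Python 3, where urllib2.urlopen() is available as urllib.request.urlopen(). The helper names save_zip and download_zip are made up for illustration and are not part of any library.

```python
import contextlib
import zipfile
from urllib.request import urlopen  # Python 3 location of urllib2.urlopen


def save_zip(data, dest_path):
    """Write raw zip bytes to dest_path and return the archive's member names."""
    # Zip data is binary, so the file must be opened in "wb" mode.
    with open(dest_path, "wb") as fh:
        fh.write(data)
    # Passing the path to ZipFile lets the zipfile module read the archive.
    with zipfile.ZipFile(dest_path) as zf:
        return zf.namelist()


def download_zip(url, dest_path):
    """Fetch url (needs network access) and store it via save_zip."""
    # closing() guarantees the response object is closed after reading.
    with contextlib.closing(urlopen(url)) as resp:
        return save_zip(resp.read(), dest_path)
```

A call would look like download_zip(zip_url, "tl_2008_01001_edges.zip"), where zip_url is assumed to be the full ftp:// URL of the file itself, file name included. The attempt in the question passes only the directory URL, which is probably why it does not produce the archive.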
Fetching directories:
>>> files = urllib2.urlopen('ftp://ftp2.census.gov/geo/tiger/TIGER2008/01_ALABAMA/').read().splitlines()
>>> for l in files[:4]: print l
...
drwxrwsr-x 2 0 4009 4096 Nov 26 2008 01001_Autauga_County
drwxrwsr-x 2 0 4009 4096 Nov 26 2008 01003_Baldwin_County
drwxrwsr-x 2 0 4009 4096 Nov 26 2008 01005_Barbour_County
drwxrwsr-x 2 0 4009 4096 Nov 26 2008 01007_Bibb_County
>>>
Or, splitting for directory names:
>>> for l in files[:4]: print l.split()[-1]
...
01001_Autauga_County
01003_Baldwin_County
01005_Barbour_County
01007_Bibb_County
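On the glob part of the question: the glob module only matches paths on the local filesystem, so it cannot walk an FTP listing directly. One possible substitute, shown here as a sketch rather than as part of the original answer, is to apply shell-style patterns to the names pulled out of the listing with fnmatch. The helper names entry_names and match_names are hypothetical, and the sample lines are the ones printed above. In Python 3, read() returns bytes, so a real listing would need to be decoded before splitting.

```python
import fnmatch


def entry_names(listing_lines):
    """Return the last whitespace-separated field of each non-empty line."""
    return [line.split()[-1] for line in listing_lines if line.strip()]


def match_names(listing_lines, pattern):
    """Keep only the names that match a shell-style pattern."""
    return fnmatch.filter(entry_names(listing_lines), pattern)


# Sample lines copied from the directory listing shown above.
sample = [
    "drwxrwsr-x 2 0 4009 4096 Nov 26 2008 01001_Autauga_County",
    "drwxrwsr-x 2 0 4009 4096 Nov 26 2008 01003_Baldwin_County",
    "drwxrwsr-x 2 0 4009 4096 Nov 26 2008 01005_Barbour_County",
    "drwxrwsr-x 2 0 4009 4096 Nov 26 2008 01007_Bibb_County",
]

# Select the counties whose name part starts with "B".
print(match_names(sample, "*_B*_County"))
```

Taking the last field with split()[-1] mirrors the approach above and assumes entry names contain no spaces.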