Which is best in Python: urllib2, PycURL or mechanize?
Question
OK, so I need to download some web pages using Python, and I did a quick investigation of my options.
Included with Python:
urllib - it seems to me that I should use urllib2 instead. urllib has no cookie support and handles HTTP/FTP/local files only (no SSL)
urllib2 - complete HTTP/FTP client, supports most needed things like cookies, but does not support all HTTP verbs (only GET and POST, no TRACE, etc.)
Full featured:
mechanize - can use/save Firefox/IE cookies, can take actions like following the second link on a page, actively maintained (0.2.5 released in March 2011)
PycURL - supports everything curl does (FTP, FTPS, HTTP, HTTPS, GOPHER, TELNET, DICT, FILE and LDAP); bad news: not updated since Sep 9, 2008 (7.19.0)
New possibilities:
urllib3 - supports connection reuse/pooling and file posting
Deprecated (a.k.a. use urllib/urllib2 instead):
httplib - HTTP/HTTPS only (no FTP)
httplib2 - HTTP/HTTPS only (no FTP)
The first thing that strikes me is that urllib/urllib2/PycURL/mechanize are all pretty mature solutions that work well. mechanize and PycURL ship with a number of Linux distributions (e.g. Fedora 13) and BSDs, so installation is typically a non-issue (so that's good).
urllib2 looks good, but I'm wondering why PycURL and mechanize both seem very popular. Is there something I am missing (i.e. if I use urllib2, will I paint myself into a corner at some point)? I'd really like some feedback on the pros and cons of these options so I can make the best choice for myself.
Edit: added a note on verb support in urllib2
Recommended answer
- urllib2 is found in every Python install everywhere, so it is a good base upon which to start.
- PycURL is useful for people already used to using libcurl, exposes more of the low-level details of HTTP, plus it gains any fixes or improvements applied to libcurl.
- mechanize is used to persistently drive a connection much like a browser would.
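To make the "good base" point concrete, here is a minimal sketch of the standard-library route. It assumes Python 3, where urllib2's functionality lives in `urllib.request` and `cookielib` became `http.cookiejar`. The helper names `build_cookie_opener`, `make_request` and `fetch` are made up for illustration, and the URL in the usage note is a placeholder. It also shows that the verb limitation mentioned in the question can be worked around, since `Request` accepts an explicit `method` argument (Python 3.3 and later).

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return an opener that stores and resends cookies, plus its cookie jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def make_request(url, method="GET", data=None):
    """Build a request with an explicit HTTP verb (hypothetical helper)."""
    return urllib.request.Request(url, data=data, method=method)


def fetch(opener, request, timeout=10):
    """Send the request and return the body as bytes (needs network access)."""
    with opener.open(request, timeout=timeout) as response:
        return response.read()
```

Something like `opener, jar = build_cookie_opener()` followed by `fetch(opener, make_request("http://example.com/"))` would download a page, with any cookies the server sets ending up in `jar` for later requests through the same opener.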
It's not a matter of one being better than the other; it's a matter of choosing the appropriate tool for the job.
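One way to judge which tool fits is to look at what a browser-like step costs without mechanize. The sketch below is not mechanize's API. It only uses the standard library (Python 3 assumed) to find the n-th link on a page by hand, which is the kind of bookkeeping a browser-style library takes over. `LinkCollector` and `nth_link` are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href="..."> in a document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, href))


def nth_link(html, base_url, n):
    """Return the n-th link (0-based) found in the HTML text."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links[n]
```

Following the second link would then mean fetching the page, decoding it, calling `nth_link(text, page_url, 1)` and fetching the result. If a job needs many such steps, plus forms and session state, a browser-style tool is likely the better match. For plain downloads the standard library is usually enough.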