Which is best in Python: urllib2, PycURL or mechanize?


Question

OK, so I need to download some web pages using Python and did a quick investigation of my options.

Included with Python:

urllib - it seems to me that I should use urllib2 instead. urllib has no cookie support and handles HTTP/FTP/local files only (no SSL)

urllib2 - a complete HTTP/FTP client that supports most needed things like cookies, but does not support all HTTP verbs (only GET and POST, no TRACE, etc.)

Full featured:

mechanize - can use/save Firefox/IE cookies, can take actions like following the second link on a page, actively maintained (0.2.5 released in March 2011)

PycURL - supports everything curl does (FTP, FTPS, HTTP, HTTPS, GOPHER, TELNET, DICT, FILE and LDAP). Bad news: not updated since Sep 9, 2008 (7.19.0)

New possibilities:

urllib3 - supports connection re-use/pooling and file posting

Deprecated (a.k.a. use urllib/urllib2 instead):

httplib - HTTP/HTTPS only (no FTP)

httplib2 - HTTP/HTTPS only (no FTP)

The first thing that strikes me is that urllib/urllib2/PycURL/mechanize are all pretty mature solutions that work well. mechanize and PycURL ship with a number of Linux distributions (e.g. Fedora 13) and BSDs, so installation is typically a non-issue (so that's good).

urllib2 looks good, but I'm wondering why PycURL and mechanize both seem very popular. Is there something I am missing (i.e. if I use urllib2, will I paint myself into a corner at some point)? I'd really like some feedback on the pros/cons of these options so I can make the best choice for myself.

Edit: added a note on verb support in urllib2.
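As a side note on the verb limitation: in Python 3, where urllib2 was folded into `urllib.request`, `Request` accepts a `method` argument, so verbs other than GET and POST are reachable. The following is only a minimal sketch, assuming Python 3.3 or later; the `build_request` helper and the example.com URL are made up for illustration, and nothing is sent over the network.

```python
import urllib.request


def build_request(url, method=None, data=None):
    """Build (but do not send) a Request using the given HTTP verb.

    Hypothetical helper for illustration. When method is None, urllib
    falls back to its default: GET without a body, POST with one.
    """
    return urllib.request.Request(url, data=data, method=method)


# Placeholder URL; the request is only constructed, never opened.
req = build_request("http://example.com/resource", method="DELETE")
print(req.get_method())
```

Passing the resulting object to `urllib.request.urlopen` would send it with that verb; whether the server accepts the verb is a separate matter.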

Recommended answer

  • urllib2 is found in every Python install everywhere, so it is a good base upon which to start.
  • PycURL is useful for people already used to libcurl. It exposes more of the low-level details of HTTP, and it gains any fixes or improvements applied to libcurl.
  • mechanize is used to persistently drive a connection, much like a browser would.

It's not a matter of one being better than the other; it's a matter of choosing the appropriate tool for the job.
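To make the "good base" point concrete, here is a rough sketch of a standard-library download with cookie support. It assumes Python 3, where urllib2's functionality lives in `urllib.request` and `cookielib` became `http.cookiejar`; the names `make_opener` and `fetch` are hypothetical helpers, not part of any library.

```python
import http.cookiejar
import urllib.request


def make_opener():
    """Return (opener, jar): an opener that stores cookies in jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def fetch(opener, url, timeout=10):
    """Download url through opener and return the raw response bytes."""
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

A typical use would be `opener, jar = make_opener()` followed by `fetch(opener, some_url)`; cookies set by the server end up in `jar` and are sent back on later requests made through the same opener. For form filling and link following, mechanize layers browser-like behavior on top of this kind of plumbing.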
