How to download a file over HTTP with authorization in Python 3.0, working around bugs?


Problem description

I have a script that I'd like to continue using, but it looks like I either have to find a workaround for a bug in Python 3 or downgrade back to 2.6, which would mean downgrading other scripts as well...

Hopefully someone here has already managed to find a workaround.

The problem is that, because of the changes in Python 3.0 regarding bytes and strings, apparently not all of the library code has been tested.

The script downloads a page from a web server. In Python 2.6 it passed a username and password as part of the URL, but in Python 3.0 this no longer works.

For example:

import urllib.request
url = "http://username:password@server/file"
urllib.request.urlretrieve(url, "temp.dat")

This fails with the following exception:

Traceback (most recent call last):
  File "C:\Temp\test.py", line 5, in <module>
    urllib.request.urlretrieve(url, "test.html");
  File "C:\Python30\lib\urllib\request.py", line 134, in urlretrieve
    return _urlopener.retrieve(url, filename, reporthook, data)
  File "C:\Python30\lib\urllib\request.py", line 1476, in retrieve
    fp = self.open(url, data)
  File "C:\Python30\lib\urllib\request.py", line 1444, in open
    return getattr(self, name)(url)
  File "C:\Python30\lib\urllib\request.py", line 1618, in open_http
    return self._open_generic_http(http.client.HTTPConnection, url, data)
  File "C:\Python30\lib\urllib\request.py", line 1576, in _open_generic_http
    auth = base64.b64encode(user_passwd).strip()
  File "C:\Python30\lib\base64.py", line 56, in b64encode
    raise TypeError("expected bytes, not %s" % s.__class__.__name__)
TypeError: expected bytes, not str

Apparently, base64 encoding now requires bytes as input, so urlretrieve (or some code inside it), which builds a username:password string and tries to base64-encode it for basic authorization, fails.
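As a minimal sketch of the type mismatch described above: in Python 3, `base64.b64encode` rejects a `str`, and encoding the string to bytes first avoids the error. The credentials are the placeholders from the question, and the choice of UTF-8 is an assumption, not something the library mandates.

```python
import base64

user_passwd = "username:password"  # placeholder credentials from the question

# Passing a str reproduces the TypeError seen in the traceback.
try:
    base64.b64encode(user_passwd)
    str_input_failed = False
except TypeError:
    str_input_failed = True

# Encoding to bytes first works. b64encode also returns bytes,
# so decode the result to get a str usable as a header value.
token = base64.b64encode(user_passwd.encode("utf-8")).decode("ascii")
auth_header = "Basic " + token
```

The resulting `auth_header` value could be sent as an `Authorization` header on a request object, which sidesteps the failing library code path.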

If I instead try to use urlopen, like this:

import urllib.request
url = "http://username:password@server/file"
f = urllib.request.urlopen(url)
contents = f.read()

Then it fails with this exception:

Traceback (most recent call last):
  File "C:\Temp\test.py", line 5, in <module>
    f = urllib.request.urlopen(url);
  File "C:\Python30\lib\urllib\request.py", line 122, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python30\lib\urllib\request.py", line 359, in open
    response = self._open(req, data)
  File "C:\Python30\lib\urllib\request.py", line 377, in _open
    '_open', req)
  File "C:\Python30\lib\urllib\request.py", line 337, in _call_chain
    result = func(*args)
  File "C:\Python30\lib\urllib\request.py", line 1082, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "C:\Python30\lib\urllib\request.py", line 1051, in do_open
    h = http_class(host, timeout=req.timeout) # will parse host:port
  File "C:\Python30\lib\http\client.py", line 620, in __init__
    self._set_hostport(host, port)
  File "C:\Python30\lib\http\client.py", line 632, in _set_hostport
    raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
http.client.InvalidURL: nonnumeric port: 'password@server'

Apparently the URL parsing in this "next gen url retrieval library" doesn't know what to do with a username and password in the URL.
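One possible way around the parsing failure, sketched here under the assumption that the credentials can be handled separately from the request URL, is to split them out with `urllib.parse.urlsplit` and rebuild the URL without the `user:password@` prefix. The URL is the placeholder from the question.

```python
from urllib.parse import urlsplit, urlunsplit

url = "http://username:password@server/file"  # placeholder URL from the question

parts = urlsplit(url)
username = parts.username
password = parts.password

# Rebuild the network location without the "user:password@" prefix.
# Note: parts.hostname is always returned in lower case.
netloc = parts.hostname
if parts.port is not None:
    netloc = "{}:{}".format(netloc, parts.port)

clean_url = urlunsplit(
    (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
)
```

The `clean_url` can then be passed to `urlopen`, with `username` and `password` supplied through an authentication handler or an explicit header.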

What other options do I have?

Recommended answer

Direct from the Py3k docs: http://docs.python.org/dev/py3k/library/urllib.request.html#examples

import urllib.request
# Create an OpenerDirector with support for Basic HTTP Authentication...
auth_handler = urllib.request.HTTPBasicAuthHandler()
auth_handler.add_password(realm='PDQ Application',
                          uri='https://mahler:8092/site-updates.py',
                          user='klem',
                          passwd='kadidd!ehopper')
opener = urllib.request.build_opener(auth_handler)
# ...and install it globally so it can be used with urlopen.
urllib.request.install_opener(opener)
urllib.request.urlopen('http://www.example.com/login.html')
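The docs example above opens a page but does not save it. A minimal sketch of applying the same handler approach to the original goal of downloading to a file might look like the following. The helper names `build_auth_opener` and `download` are hypothetical, the server and credentials are the placeholders from the question, and using `HTTPPasswordMgrWithDefaultRealm` with a realm of `None` is an assumption made here so that the server's realm does not need to be known in advance.

```python
import shutil
import urllib.request


def build_auth_opener(top_level_url, user, passwd):
    """Return an opener that answers Basic auth challenges for top_level_url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # A realm of None means "use these credentials for any realm".
    password_mgr.add_password(None, top_level_url, user, passwd)
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)


def download(opener, url, filename):
    """Stream the response body for url into filename."""
    with opener.open(url) as response, open(filename, "wb") as out:
        shutil.copyfileobj(response, out)


opener = build_auth_opener("http://server/", "username", "password")
# download(opener, "http://server/file", "temp.dat")  # needs a reachable server
```

Keeping the opener local rather than calling `install_opener` avoids changing the behavior of every other `urlopen` call in the process; either approach should work for a single script.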
