urllib cannot read https
Question
(Python 3.4.2) Would anyone be able to help me fetch https pages with urllib? I've spent hours trying to figure this out.
Here's what I'm trying to do (pretty basic):
import urllib.request
url = "".join((baseurl, other_string, midurl, query))
response = urllib.request.urlopen(url)
html = response.read()
Here's my error output when I run it:
File "./script.py", line 124, in <module>
response = urllib.request.urlopen(url)
File "/usr/lib/python3.4/urllib/request.py", line 153, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.4/urllib/request.py", line 455, in open
response = self._open(req, data)
File "/usr/lib/python3.4/urllib/request.py", line 478, in _open
'unknown_open', req)
File "/usr/lib/python3.4/urllib/request.py", line 433, in _call_chain
result = func(*args)
File "/usr/lib/python3.4/urllib/request.py", line 1244, in unknown_open
raise URLError('unknown url type: %s' % type)
urllib.error.URLError: <urlopen error unknown url type: 'https>
I've also tried passing data=None, to no avail:
response = urllib.request.urlopen(url, data=None)
I've also tried this:
import urllib.request, ssl
https_sslv3_handler = urllib.request.HTTPSHandler(context=ssl.SSLContext(ssl.PROTOCOL_SSLv3))
opener = urllib.request.build_opener(https_sslv3_handler)
urllib.request.install_opener(opener)
resp = opener.open(url)
html = resp.read().decode('utf-8')
print(html)
A similar error occurs with the script above: it fails on the "resp = ..." line and complains that 'https' is an unknown url type.
Python was compiled with SSL support on my computer (Arch Linux). I've tried reinstalling python3 and openssl a few times, but that didn't help. I haven't tried uninstalling Python completely and then reinstalling, because I would also need to uninstall a lot of other programs on my computer.
Does anyone know what's going on?
-----EDIT-----
I figured it out, thanks to Andrew Stevlov's answer. My url had a ":" in it, and I guess urllib didn't like that. I replaced it with "%3A" and now it's working. Thanks so much, guys!
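For anyone hitting the same thing: rather than replacing the ":" by hand, the standard library can do the percent-encoding. This is only a minimal sketch, assuming the ":" sits in a query value. The names base_url and query_value and the example.com address are made-up placeholders, since the real URL pieces aren't shown above.

```python
from urllib.parse import quote, urlencode

# Hypothetical values; the original URL pieces are not shown in the question.
base_url = "https://example.com/search"
query_value = "tag:python"

# quote() percent-encodes reserved characters; with safe="" the ":" becomes "%3A".
encoded = quote(query_value, safe="")
print(encoded)

# urlencode() builds a whole query string and encodes each value the same way.
url = base_url + "?" + urlencode({"q": query_value})
print(url)
```

Only the variable parts should be encoded. Running quote() with safe="" over the whole URL would also escape the "://" after the scheme.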
Recommended answer
Double-check your compilation options; it looks like something is wrong with your box.
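One quick way to check whether the interpreter really has SSL support, as a rough sketch rather than a full diagnosis: if Python was built without SSL, importing ssl fails, and urllib.request does not define HTTPSHandler at all.

```python
import ssl
import urllib.request

# If Python was built without SSL, "import ssl" above raises ImportError.
print(ssl.OPENSSL_VERSION)

# urllib.request only defines HTTPSHandler when SSL support is available.
has_https = hasattr(urllib.request, "HTTPSHandler")
print("HTTPSHandler available:", has_https)
```

If both lines print without errors, the build itself is probably fine and the problem is more likely in the URL being passed in.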
At least the following code works for me:
from urllib.request import urlopen
resp = urlopen('https://github.com')
print(resp.read())
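If a plain URL like that works but your own URL does not, it may help to look at what the scheme parses to before calling urlopen. The sketch below is only an illustration: check_scheme is a hypothetical helper, not part of urllib. The error message in the question shows a quote character in front of https, which might mean a stray character ended up at the start of the joined string, so the second call tries that case.

```python
from urllib.parse import urlsplit

def check_scheme(url):
    """Return the parsed scheme, or raise ValueError if it is not http/https."""
    scheme = urlsplit(url).scheme
    if scheme not in ("http", "https"):
        raise ValueError("unexpected scheme %r in URL %r" % (scheme, url))
    return scheme

# A well-formed URL parses to the expected scheme.
print(check_scheme("https://github.com"))

# A stray leading quote (an assumed example) keeps the scheme from being recognised.
try:
    check_scheme("'https://github.com")
except ValueError as exc:
    print(exc)
```

Printing repr(url) right before the urlopen call shows the same kind of problem directly.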