Python 3.4 urllib.request error (HTTP 403)


Question

I'm trying to open and parse an HTML page. In Python 2.7.8 I have no problem:

import urllib
url = "https://ipdb.at/ip/66.196.116.112"
html = urllib.urlopen(url).read()

Everything works fine. However, I want to move to Python 3.4, and there I get HTTP error 403 (Forbidden). My code:

import urllib.request
html = urllib.request.urlopen(url) # same URL as before

  File "C:\Python34\lib\urllib\request.py", line 153, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python34\lib\urllib\request.py", line 461, in open
    response = meth(req, response)
  File "C:\Python34\lib\urllib\request.py", line 574, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python34\lib\urllib\request.py", line 499, in error
    return self._call_chain(*args)
  File "C:\Python34\lib\urllib\request.py", line 433, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 582, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

It works for other URLs that don't use HTTPS. For example:

url = 'http://www.stopforumspam.com/ipcheck/212.91.188.166'

This one works without any problem.

Recommended answer

The site appears to reject the default User-Agent that Python 3.x's urllib sends (Python-urllib/3.4 on Python 3.4).
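To see which User-Agent a default opener would send, you can inspect the opener's headers without making any network request. This is only a small sketch; the helper name `default_user_agent` is made up for illustration.

```python
import urllib.request


def default_user_agent():
    """Return the User-Agent value a default urllib opener would send."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) tuples added to every request.
    return dict(opener.addheaders).get("User-agent")


# The value depends on the running interpreter, e.g. Python-urllib/3.4 on 3.4.
print(default_user_agent())
```

Some sites appear to block this value, which would explain the 403 seen in the question.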

Specifying a User-Agent header should solve your problem:

import urllib.request
req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
html = urllib.request.urlopen(req).read()
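If you need this in more than one place, the same idea can be wrapped in a small helper. The following is a minimal sketch, not a definitive implementation: the names `build_request` and `fetch`, the default `Mozilla/5.0` value, and the 10-second timeout are assumptions chosen for illustration.

```python
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent="Mozilla/5.0", timeout=10):
    """Download the body of url, sending the given User-Agent (needs network)."""
    req = build_request(url, user_agent)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


# Offline check: Request stores header names capitalized as "User-agent".
req = build_request("https://ipdb.at/ip/66.196.116.112")
print(req.get_header("User-agent"))
```

Calling `fetch(url)` with the URL from the question performs the actual download; whether the site accepts a given User-Agent value is up to the site.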

Note: the Python 2.x urllib version also receives the 403 status, but unlike Python 2.x urllib2 and Python 3.x urllib, it does not raise an exception.

You can confirm that with the following code:

import urllib  # Python 2.x only; same url as in the question
print(urllib.urlopen(url).getcode())  # => 403
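In Python 3.x the same check has to catch the exception, because `urllib.request.urlopen` raises `urllib.error.HTTPError` for a 403. A rough equivalent might look like the sketch below; the function name `status_of` and its `opener` parameter (there only so the function can be exercised without network access) are hypothetical.

```python
import urllib.error
import urllib.request


def status_of(url, opener=urllib.request.urlopen):
    """Return the HTTP status code for url without letting HTTPError escape."""
    try:
        with opener(url) as resp:
            return resp.getcode()
    except urllib.error.HTTPError as err:
        # HTTPError carries the status code of the failed response.
        return err.code
```

With the URL from the question and the default User-Agent, `status_of(url)` would be expected to return 403 for as long as the site keeps rejecting that User-Agent.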
