Python - How to handle HTTPS requests with (urllib2 + SSL) through an HTTP proxy
Question
I am trying to test a proxy connection by using urllib2.ProxyHandler. However, in some situations I will need to request an HTTPS website (e.g. https://www.whatismyip.com/).
urllib2.urlopen() throws an error when the request targets an HTTPS site, so I tried to use a helper function to replace the urlopen method.
Here is the helper function:
import ssl
import urllib2

def urlopen(url, timeout):
    if hasattr(ssl, 'SSLContext'):
        # Build a context that skips certificate and hostname verification
        SslContext = ssl.create_default_context()
        SslContext.check_hostname = False
        SslContext.verify_mode = ssl.CERT_NONE
        return urllib2.urlopen(url, timeout=timeout, context=SslContext)
    else:
        return urllib2.urlopen(url, timeout=timeout)
This helper function is based on another answer.
Then I use:
urllib2.install_opener(
    urllib2.build_opener(
        urllib2.ProxyHandler({'http': '127.0.0.1:8080'})
    )
)
to set up an HTTP proxy for the urllib2 opener.
Ideally, it should work when I request a website with urlopen('http://whatismyip.com', 30), and it should pass all traffic through the HTTP proxy.
However, urlopen() takes the if hasattr(ssl, 'SSLContext') branch every time, even for an HTTP site. In addition, HTTPS sites do not use the HTTP proxy either. As a result the proxy has no effect and all traffic goes over the unproxied network.
I also tried the suggestion from another answer to change HTTP to HTTPS, urllib2.ProxyHandler({'https': '127.0.0.1:8080'}), but it still does not work.
My proxy is working. If I use urllib2.urlopen() instead of the rewritten urlopen(), it works for HTTP sites.
But I do need to handle the situation where urlopen has to be used on an HTTPS-only site.
How can I do that?
Thanks
UPDATE1: I cannot get this to work with Python 2.7.11, while some servers running Python 2.7.5 work properly. I assume it is a Python version issue.
urllib2 does not send HTTPS requests through the proxy, so all HTTPS web addresses fail to use the proxy.
Recommended answer
The problem is that when you pass the context argument to urllib2.urlopen(), urllib2 creates an opener itself instead of using the global one, which is the one that gets set when you call urllib2.install_opener(). As a result, the ProxyHandler instance you meant to use is never used.
The solution is not to install an opener but to use the opener directly. When building your opener, you have to pass both an instance of the ProxyHandler class (to set proxies for the http and https protocols) and an instance of the HTTPSHandler class (to set the HTTPS context).
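The approach described above could look roughly like the sketch below. It is a minimal illustration, not a definitive implementation: the function name build_proxy_opener and the extra opener parameter on urlopen are hypothetical names chosen here, the proxy address is the one from the question, and the code assumes Python 2.7.9 or later (where HTTPSHandler accepts a context argument). The import fallback to urllib.request is only there so the same sketch also runs on Python 3. The unverified SSL context is copied from the question's helper and is suitable for testing only.

```python
import ssl

try:
    import urllib2  # Python 2.7.9+
except ImportError:
    # Assumption: on Python 3 the same classes live in urllib.request
    import urllib.request as urllib2


def build_proxy_opener(proxy_address):
    """Return an opener that routes both HTTP and HTTPS through the proxy."""
    # Same unverified context as the helper in the question (insecure).
    ssl_context = ssl.create_default_context()
    ssl_context.check_hostname = False
    ssl_context.verify_mode = ssl.CERT_NONE

    # One ProxyHandler covering both protocols
    proxy_handler = urllib2.ProxyHandler({
        'http': proxy_address,
        'https': proxy_address,
    })
    # The SSL context goes on the handler, not on urllib2.urlopen()
    https_handler = urllib2.HTTPSHandler(context=ssl_context)
    return urllib2.build_opener(proxy_handler, https_handler)


def urlopen(url, timeout, opener):
    # Call the opener directly; passing context= to urllib2.urlopen()
    # would make urllib2 build its own opener without the proxy.
    return opener.open(url, timeout=timeout)
```

With this in place, a call such as urlopen('https://www.whatismyip.com/', 30, build_proxy_opener('127.0.0.1:8080')) should go through the proxy for HTTP and HTTPS URLs alike, because the same opener carries both handlers.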
I created https://bugs.python.org/issue29379 for this issue.