Python - How to handle HTTPS requests with (urllib2 + SSL) through an HTTP proxy

Question

I am trying to test a proxy connection using urllib2.ProxyHandler. However, in some situations I may need to request an HTTPS website (e.g. https://www.whatismyip.com/).

urllib2.urlopen() throws an error when requesting an HTTPS site, so I tried to use a helper function to override the urlopen method.

Here is the helper function:

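import ssl
import urllib2


def urlopen(url, timeout):
    # ssl.SSLContext exists from Python 2.7.9 on; older versions take the else branch.
    if hasattr(ssl, 'SSLContext'):
        # Disable certificate checks (check_hostname must be turned off first).
        ssl_context = ssl.create_default_context()
        ssl_context.check_hostname = False
        ssl_context.verify_mode = ssl.CERT_NONE
        return urllib2.urlopen(url, timeout=timeout, context=ssl_context)
    else:
        return urllib2.urlopen(url, timeout=timeout)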

This helper function is based on this answer.

Then I use:

urllib2.install_opener(
     urllib2.build_opener(
         urllib2.ProxyHandler({'http': '127.0.0.1:8080'})
     )
)

to set up the HTTP proxy for the urllib2 opener.

Ideally, this should work when I request a website with urlopen('http://whatismyip.com', 30), and all traffic should pass through the HTTP proxy.

However, urlopen() always takes the if hasattr(ssl, 'SSLContext') branch, even for an HTTP site. In addition, HTTPS sites do not use the HTTP proxy either. As a result, the HTTP proxy has no effect and all traffic goes through the unproxied network.

I also tried this answer, changing http to https as in urllib2.ProxyHandler({'https': '127.0.0.1:8080'}), but it still does not work.

My proxy is working. If I use urllib2.urlopen() instead of the rewritten urlopen(), it works for HTTP sites.

But I do need to handle the situation where urlopen has to be used on an HTTPS-only site.

How can I do this?

Thanks

UPDATE1: I cannot get this to work with Python 2.7.11, while some servers running Python 2.7.5 work properly. I assume it is a Python version issue.

urllib2 does not go through the proxy for HTTPS, so all HTTPS addresses fail to use the proxy.

Recommended answer

The problem is that when you pass the context argument to urllib2.urlopen(), urllib2 creates an opener itself instead of using the global one, which is the one set when you call urllib2.install_opener(). As a result, the ProxyHandler instance you meant to use is never used.
The solution is not to install an opener but to use the opener directly. When building your opener, you have to pass both an instance of the ProxyHandler class (to set proxies for the http and https protocols) and an instance of the HTTPSHandler class (to set the https context).
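A minimal sketch of that approach, assuming the proxy address 127.0.0.1:8080 from the question and the same relaxed certificate settings as the question's helper. The name build_proxy_opener is hypothetical, and the import fallback is only there so the sketch also runs on Python 3, where the same handler classes live in urllib.request:

```python
import ssl

try:
    import urllib2  # Python 2.7.9+
except ImportError:
    import urllib.request as urllib2  # Python 3: same handler classes

# Assumption: proxy address taken from the question; adjust as needed.
PROXY = '127.0.0.1:8080'


def build_proxy_opener(proxy=PROXY):
    """Build an opener that routes both HTTP and HTTPS through the proxy."""
    # Certificate verification is disabled only to mirror the question's
    # helper; check_hostname has to be switched off before verify_mode.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    return urllib2.build_opener(
        urllib2.ProxyHandler({'http': proxy, 'https': proxy}),
        urllib2.HTTPSHandler(context=context),
    )


def urlopen(url, timeout):
    # Call the opener directly; urllib2.urlopen(..., context=...) would
    # build its own opener and skip the ProxyHandler.
    return build_proxy_opener().open(url, timeout=timeout)
```

With this in place, urlopen('https://www.whatismyip.com/', 30) should be sent through the proxy, provided the proxy supports tunnelling HTTPS via CONNECT.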

I have created https://bugs.python.org/issue29379 for this issue.
