Making HTTP requests via the Python Requests module does not work through a proxy where curl does. Why?

Problem description

Using this curl command I am able to get the response I am looking for from Bash:

curl -v -u z:secret_key --proxy http://proxy.net:80 \
  -H "Content-Type: application/json" https://service.com/data.json

I have already seen this other post on proxies with the Requests module

It helped me formulate my code in Python, but I need to make the request via a proxy. However, even when I supply the proper proxies, it isn't working. Perhaps I'm just not seeing something?

>>> import requests
>>> requests.request('GET', 'https://service.com/data.json',
...     headers={'Content-Type': 'application/json'},
...     proxies={'http': 'http://proxy.net:80', 'https': 'http://proxy.net:80'},
...     auth=('z', 'secret_key'))

Furthermore, in the same Python console I can use urllib to make a request and have it succeed.

>>> import urllib
>>> urllib.urlopen("http://www.httpbin.org").read()
---results---

Even a request to a plain non-HTTPS address fails with Requests.

>>> requests.get('http://www.httpbin.org')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.6/site-packages/requests/api.py", line 79, in get
    return request('get', url, **kwargs)
  File "/Library/Python/2.6/site-packages/requests/api.py", line 66, in request
    prefetch=prefetch
  File "/Library/Python/2.6/site-packages/requests/sessions.py", line 191, in request
    r.send(prefetch=prefetch)
  File "/Library/Python/2.6/site-packages/requests/models.py", line 454, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: Max retries exceeded for url:

Requests is so elegant and awesome, so how could it be failing in this instance?

Solution

The problem actually lies with Python's standard URL access libraries: urllib, urllib2 and httplib. I can't remember which one is the exact culprit, so for simplicity's sake let's just call it urllib. Unfortunately, urllib doesn't implement the HTTP CONNECT method, which is required for accessing an HTTPS site through an HTTP(S) proxy. My efforts to add that functionality on top of urllib were not successful (it has been a while since I tried). So unfortunately the only option I know to work in this case is pycurl.
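To make the missing piece concrete, here is a minimal sketch of what a CONNECT exchange looks like on the wire. It is written for Python 3 rather than the Python 2.6 shown in the question, it only builds and parses text (nothing is sent), and the function names `build_connect_request` and `tunnel_established` are made up for illustration. It is not how any particular library implements tunnelling.

```python
def build_connect_request(host, port):
    """Return the bytes a client sends to an HTTP proxy to ask for a tunnel.

    After the proxy answers with a 2xx status, the client starts the TLS
    handshake with the target host over the same socket.
    """
    target = "{}:{}".format(host, port)
    lines = [
        "CONNECT {} HTTP/1.1".format(target),  # request line names host:port, not a URL
        "Host: {}".format(target),
    ]
    # A blank line terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def tunnel_established(status_line):
    """Return True if the proxy's status line reports a 2xx response."""
    parts = status_line.split(None, 2)
    return (
        len(parts) >= 2
        and len(parts[1]) == 3
        and parts[1].isdigit()
        and parts[1].startswith("2")
    )


if __name__ == "__main__":
    # Host taken from the question; 443 is the default HTTPS port.
    print(build_connect_request("service.com", 443).decode("ascii"))
```

A client that cannot send this request has no way to reach an HTTPS origin through a plain HTTP proxy, which is the limitation described above.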

However, there is a relatively clean solution whose API is almost exactly the same as that of Requests, but which uses a pycurl backend instead of the Python standard libraries.

The library is called human_curl. I've used it myself with great results.
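For comparison, the curl command from the question can be expressed with pycurl directly. The following is a rough sketch in Python 3, assuming pycurl is installed and recent enough to support `WRITEDATA`; the helper names `curl_options` and `fetch` are hypothetical and are not part of pycurl or human_curl.

```python
from io import BytesIO


def curl_options(url, proxy, user, password, headers):
    """Map the curl flags from the question onto pycurl option names."""
    return [
        ("URL", url),                                 # the target URL
        ("PROXY", proxy),                             # --proxy
        ("USERPWD", "{}:{}".format(user, password)),  # -u user:password
        ("HTTPHEADER",                                # -H "Name: value"
         ["{}: {}".format(k, v) for k, v in headers.items()]),
    ]


def fetch(url, proxy, user, password, headers):
    """Perform the request through the proxy; return (status, body bytes)."""
    # Imported here so the option mapping above can be inspected
    # even on a machine where pycurl is not installed.
    import pycurl

    body = BytesIO()
    handle = pycurl.Curl()
    try:
        for name, value in curl_options(url, proxy, user, password, headers):
            handle.setopt(getattr(pycurl, name), value)
        handle.setopt(pycurl.WRITEDATA, body)  # collect the response body
        handle.perform()
        status = handle.getinfo(pycurl.RESPONSE_CODE)
    finally:
        handle.close()
    return status, body.getvalue()
```

With the values from the question, the call would be `fetch("https://service.com/data.json", "http://proxy.net:80", "z", "secret_key", {"Content-Type": "application/json"})`.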
