Making HTTP requests via Python Requests module not working via proxy where curl does? Why?


Problem description


Using this curl command from Bash, I get the response I am looking for:

curl -v -u z:secret_key --proxy http://proxy.net:80  \
-H "Content-Type: application/json" https://service.com/data.json

I have already seen another post about using proxies with the Requests module.

It helped me formulate my code in Python, but the request has to go through a proxy, and even when I supply the correct proxies it does not work. Perhaps I'm just not seeing something?

>>> import requests
>>> requests.request('GET', 'https://service.com/data.json',
...     headers={'Content-Type': 'application/json'},
...     proxies={'http': 'http://proxy.net:80', 'https': 'http://proxy.net:80'},
...     auth=('z', 'secret_key'))

Furthermore, at the same Python console I can make a successful request with urllib:

>>> import urllib
>>> urllib.urlopen("http://www.httpbin.org").read()
---results---
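
That urllib succeeds without any proxy argument suggests it may be picking up proxy settings on its own (from the environment or the operating system). This is only one plausible, unconfirmed explanation. A minimal diagnostic sketch, assuming Python 3 (where the call is `urllib.request.getproxies()`; in Python 2 it is `urllib.getproxies()`) and a hypothetical helper name `proxies_for_requests`, shows how to inspect what urllib detects and turn it into a mapping of the shape Requests expects:

```python
import urllib.request


def proxies_for_requests(fallback=None):
    """Build a scheme -> proxy URL mapping from the proxies urllib detects.

    `fallback` is a hypothetical convenience parameter: if given, it is used
    for any of 'http' / 'https' that urllib did not detect.
    """
    detected = urllib.request.getproxies()
    # Keep only the schemes relevant to HTTP(S) requests.
    proxies = {
        scheme: url
        for scheme, url in detected.items()
        if scheme in ("http", "https")
    }
    if fallback:
        proxies.setdefault("http", fallback)
        proxies.setdefault("https", fallback)
    return proxies


if __name__ == "__main__":
    # proxy.net:80 is the placeholder proxy from the question.
    print(proxies_for_requests(fallback="http://proxy.net:80"))
```

The resulting dictionary can be passed as the `proxies` argument shown earlier. If the detected values differ from the ones supplied by hand, that difference is worth investigating first.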

Even a request to a plain non-HTTPS address fails with Requests:

>>> requests.get('http://www.httpbin.org')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.6/site-packages/requests/api.py", line 79, in get
    return request('get', url, **kwargs)
  File "/Library/Python/2.6/site-packages/requests/api.py", line 66, in request
    prefetch=prefetch
  File "/Library/Python/2.6/site-packages/requests/sessions.py", line 191, in request
    r.send(prefetch=prefetch)
  File "/Library/Python/2.6/site-packages/requests/models.py", line 454, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: Max retries exceeded for url:

Requests is so elegant and awesome, so how could it be failing in this instance?

Solution

The problem actually lies with Python's standard URL access libraries: urllib, urllib2, and httplib. I can't remember which one is the exact culprit, so for simplicity's sake let's just call it urllib. Unfortunately, urllib doesn't implement the HTTP CONNECT method, which is required for accessing an HTTPS site through an HTTP(S) proxy. My efforts to add that functionality to urllib were not successful (it has been a while since I tried), so the only option I know to work in this case is pycurl.

However, there is a relatively clean solution: a library whose API is almost exactly the same as Python Requests, but which uses a pycurl backend instead of the Python standard libraries.

The library is called human_curl. I've used it myself and have had great results.
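
To make the CONNECT step mentioned in this answer concrete, here is a minimal sketch of the message a client sends to an HTTP proxy before starting TLS with the target, and of how the proxy's reply is judged. It is an illustration in Python 3, not the code of urllib, pycurl, or human_curl. The function names `build_connect_request` and `tunnel_established` are hypothetical, and the host is the placeholder from the question.

```python
def build_connect_request(target_host, target_port):
    """Return the bytes a client sends to an HTTP proxy to request a tunnel."""
    authority = "{0}:{1}".format(target_host, target_port)
    lines = [
        "CONNECT {0} HTTP/1.1".format(authority),
        "Host: {0}".format(authority),
        "",  # the two empty entries produce the blank line
        "",  # that terminates the header block
    ]
    return "\r\n".join(lines).encode("ascii")


def tunnel_established(status_line):
    """Return True if the proxy's status line carries a 2xx status code."""
    parts = status_line.split(None, 2)
    return (
        len(parts) >= 2
        and parts[1].isdigit()
        and 200 <= int(parts[1]) <= 299
    )


if __name__ == "__main__":
    # service.com:443 is the placeholder HTTPS target from the question.
    print(build_connect_request("service.com", 443))
    print(tunnel_established("HTTP/1.1 200 Connection established"))
```

Only after the proxy answers with a 2xx status does the client begin the TLS handshake over the same socket. A client that skips this step and sends an ordinary GET for an `https://` URL to the proxy cannot establish the encrypted connection, which matches the failure described here.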
