Python-Twisted: Reverse Proxy to HTTPS API: Could not connect
Problem description
I am trying to build a reverse proxy that talks to certain APIs (such as Twitter, GitHub, Instagram), so that any client application I want can call those APIs through it (think of it as an API manager).
Also, I am doing this inside an LXC container.
For example, here is the simplest code, which I hacked together from the examples in the Twisted docs:
from twisted.internet import reactor
from twisted.web import proxy, server
from twisted.python.log import startLogging
from sys import stdout
startLogging(stdout)
site = server.Site(proxy.ReverseProxyResource('https://api.github.com/users/defunkt', 443, b''))
reactor.listenTCP(8080, site)
reactor.run()
When I run curl within the container, I get a valid response (meaning I get the appropriate JSON).
Here is the curl command I used:
curl https://api.github.com/users/defunkt
Here is the output I get:
{
"login": "defunkt",
"id": 2,
"avatar_url": "https://avatars.githubusercontent.com/u/2?v=3",
"gravatar_id": "",
"url": "https://api.github.com/users/defunkt",
"html_url": "https://github.com/defunkt",
"followers_url": "https://api.github.com/users/defunkt/followers",
"following_url": "https://api.github.com/users/defunkt/following{/other_user}",
"gists_url": "https://api.github.com/users/defunkt/gists{/gist_id}",
"starred_url": "https://api.github.com/users/defunkt/starred{/owner}{/repo}",
"subscriptions_url": "https://api.github.com/users/defunkt/subscriptions",
"organizations_url": "https://api.github.com/users/defunkt/orgs",
"repos_url": "https://api.github.com/users/defunkt/repos",
"events_url": "https://api.github.com/users/defunkt/events{/privacy}",
"received_events_url": "https://api.github.com/users/defunkt/received_events",
"type": "User",
"site_admin": true,
"name": "Chris Wanstrath",
"company": "GitHub",
"blog": "http://chriswanstrath.com/",
"location": "San Francisco",
"email": "chris@github.com",
"hireable": true,
"bio": null,
"public_repos": 107,
"public_gists": 280,
"followers": 15153,
"following": 208,
"created_at": "2007-10-20T05:24:19Z",
"updated_at": "2016-02-26T22:34:27Z"
}
However, when I try to fetch the proxy via Firefox, I get: "Could not connect"
This is what my Twisted log looks like:
2016-02-27 [-] Log opened.
2016-02-27 [-] Site starting on 8080
2016-02-27 [-] Starting factory
2016-02-27 [-] Starting factory
2016-02-27 [-] "10.5.5.225" - - [27/Feb/2016: +0000] "GET / HTTP/1.1" 501 26 "-" "Mozilla/5.0 (X11; Debian; Linux x86_64; rv:44.0) Gecko/20100101 Firefox/44.0"
2016-02-27 [-] Stopping factory
How can I use Twisted to make an API call (most APIs are HTTPS nowadays anyway) and get the required response (basically, the "200" response with the expected JSON)?
I tried looking at this question: Convert HTTP Proxy to HTTPS Proxy in Twisted
But it didn't make much sense from a coding point of view, and it doesn't mention anything about reverse proxying.
I also tried switching out the HTTPS API call for a regular HTTP call using:
curl http[colon][slash][slash]openlibrary[dot]org[slash]authors[slash]OL1A.json
(The URL above has been formatted to avoid a link-conflict issue.)
However, I still get the same error in my browser (as mentioned above).
Edit2: I have tried running your code, but it fails with this error:
builtins.AttributeError: 'str' object has no attribute 'decode'
Recommended answer
If you read the API documentation for ReverseProxyResource, you will see that the signature of __init__ is:
def __init__(self, host, port, path, reactor=reactor):
and "host" is documented as "the host of the web server to proxy".
So you are passing a URI where Twisted expects a host.
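To make the host/port/path split concrete, here is a minimal sketch using only the standard library. The helper name split_target is my own invention for illustration (it is not part of Twisted); it just breaks a URL into the three separate pieces that the signature above asks for.

```python
from urllib.parse import urlsplit


def split_target(url):
    """Split a URL into (host, port, path) as separate values.

    Hypothetical helper for illustration only; not a Twisted API.
    """
    parts = urlsplit(url)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or {"http": 80, "https": 443}[parts.scheme]
    # The path is passed as bytes, like the b'' in the question's snippet.
    path = parts.path.encode("ascii")
    return parts.hostname, port, path


print(split_target("https://api.github.com/users/defunkt"))
```

Note that the scheme is dropped entirely here: the host argument only says where to connect, not whether to use TLS, which is the second half of the problem.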
Worse yet, ReverseProxyResource is designed for local use on a web server, and doesn't quite support https:// URLs out of the box.
It does have a (very limited) extensibility hook though, proxyClientFactoryClass, and to apologize for ReverseProxyResource not having what you need out of the box, I will show you how to use that hook to extend ReverseProxyResource with https:// support so you can use the GitHub API :).
from sys import stdout

from twisted.web import proxy, server
from twisted.logger import globalLogBeginner, textFileLogObserver
from twisted.protocols.tls import TLSMemoryBIOFactory
from twisted.internet import ssl, defer, task, endpoints

globalLogBeginner.beginLoggingTo([textFileLogObserver(stdout)])


class HTTPSReverseProxyResource(proxy.ReverseProxyResource, object):
    def proxyClientFactoryClass(self, *args, **kwargs):
        """
        Make all connections using HTTPS.
        """
        # optionsForClientTLS wants a text hostname. On Python 3 the host
        # is already a str, so only decode when it is actually bytes.
        host = self.host
        if isinstance(host, bytes):
            host = host.decode("ascii")
        return TLSMemoryBIOFactory(
            ssl.optionsForClientTLS(host), True,
            super(HTTPSReverseProxyResource, self)
            .proxyClientFactoryClass(*args, **kwargs))

    def getChild(self, path, request):
        """
        Ensure that implementation of C{proxyClientFactoryClass} is honored
        down the resource chain.
        """
        child = super(HTTPSReverseProxyResource, self).getChild(path, request)
        return HTTPSReverseProxyResource(child.host, child.port, child.path,
                                         child.reactor)


@task.react
def main(reactor):
    import sys
    forever = defer.Deferred()
    myProxy = HTTPSReverseProxyResource('api.github.com', 443,
                                        b'/users/defunkt')
    # Child path segments are bytes, so register the empty segment as b"".
    myProxy.putChild(b"", myProxy)
    site = server.Site(myProxy)
    endpoint = endpoints.serverFromString(
        reactor,
        dict(enumerate(sys.argv)).get(1, "tcp:8080:interface=127.0.0.1")
    )
    endpoint.listen(site)
    return forever
If you run this, curl http://localhost:8080/ should do what you expect.
I've taken the liberty of modernizing your Twisted code somewhat: endpoints instead of listenTCP, logger instead of twisted.python.log, and react instead of starting the reactor yourself.
The weird little putChild piece at the end is there because when we pass b"/users/defunkt" as the path, a request for / results in the client requesting /users/defunkt/ (note the trailing slash), which is a 404 in GitHub's API. If we explicitly proxy the empty-child-segment path as if it did not have the trailing segment, I believe it will do what you expect.
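To see where the trailing slash comes from, here is a simplified model of the path handling, written with the standard library only. This is my own approximation of how a child resource extends the proxied path by one segment, not Twisted's actual source, and child_path is a made-up name.

```python
from urllib.parse import quote


def child_path(base_path, segment):
    """Append one URL-quoted child segment to the proxied path.

    Simplified model for illustration; assumes each child adds
    "/" plus the quoted segment to its parent's path.
    """
    return base_path + b"/" + quote(segment, safe="").encode("utf-8")


# A request for "/" looks up the empty child segment b"",
# so the upstream path gains a bare trailing slash.
print(child_path(b"/users/defunkt", b""))
# A request for "/repos" looks up the segment b"repos".
print(child_path(b"/users/defunkt", b"repos"))
```

Registering the proxy itself as its own empty child sidesteps the first case, so / is forwarded with the path unchanged.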
PLEASE NOTE: proxying from plain-text HTTP to encrypted HTTPS can be extremely dangerous, so I've made the default listening interface here localhost-only. If your bytes transit over an actual network, you should ensure that they are properly encrypted with TLS.
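Because the listening endpoint is read from the first command-line argument, you can switch to a TLS listener without touching the code. The sketch below isolates that argument lookup so you can see what string ends up being passed to serverFromString. The function name, the script name proxy.py, and the key and certificate file names are all placeholders I made up; the ssl: description follows Twisted's server endpoint string syntax as I understand it, so check it against the endpoints documentation for your version.

```python
def endpoint_description(argv, default="tcp:8080:interface=127.0.0.1"):
    """Return argv[1] if present, else the localhost-only default.

    Hypothetical helper mirroring the dict(enumerate(...)).get(1, ...)
    idiom used in main().
    """
    return dict(enumerate(argv)).get(1, default)


# No argument given: listen on localhost only, plain TCP.
print(endpoint_description(["proxy.py"]))

# Argument given: use it verbatim, e.g. a TLS listener
# (server.key and server.crt are placeholder file names).
print(endpoint_description(
    ["proxy.py", "ssl:8443:privateKey=server.key:certKey=server.crt"]))
```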