How to write a proxy pool server in Python (when a request comes in, choose a proxy to fetch the URL content)?

Problem description

I do not know the proper name for this kind of proxy server, so you are welcome to fix my question title.

When I search for proxy servers on Google, I find many implementations such as maproxy or a-python-proxy-in-less-than-100-lines-of-code. Those proxy servers seem to simply ask the remote server to fetch a given URL.

I want to build a proxy server that contains a proxy pool (a list of http/https proxies) and exposes only one IP address and one port to serve incoming requests. When a request comes in, it should choose a proxy from the pool, perform the request through it, and return the result.

For example, I have a VPS whose IP is '192.168.1.66'. I start the proxy server on this VPS with IP '127.0.0.1' and port '8080'.

I can then use this proxy as shown below.

import requests
url = 'http://www.google.com'
headers = {
    ...
}
proxies = {
    'http': 'http://192.168.1.66:8080'
}

r = requests.get(url, headers=headers, proxies=proxies)

I have seen some implementations like this:

from twisted.web import proxy, http
from twisted.internet import reactor
from twisted.python import log
import sys
log.startLogging(sys.stdout)

class ProxyFactory(http.HTTPFactory):
    protocol = proxy.Proxy

reactor.listenTCP(8080, ProxyFactory())
reactor.run()

It works, but it is so short that I have no idea how it works or how to extend this code to use a proxy pool.

The diagram below is from hidu/proxy-manager, which is written in Go.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++  
+ client (want visit http://www.baidu.com/)              +  
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++  
                        |  
                        |  via proxy 127.0.0.1:8090  
                        |  
                        V  
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++  
+                       +         proxy pool             +  
+ proxy manager listen  ++++++++++++++++++++++++++++++++++  
+ on (127.0.0.1:8090)   +  http_proxy1,http_proxy2,      +  
+                       +  socks5_proxy1,socks5_proxy2   +  
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++  
                        |  
                        |  choose one proxy visit 
                        |  www.baidu.com  
                        |  
                        V  
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++  
+        site:www.baidu.com                              +  
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++  

Recommended answer

Your proxy pool concept is not hard to implement. If I understand correctly, you want the following flow.

  1. YOUR PROXY SERVER listens for requests on 192.168.1.66:8080
  2. CLIENT requests access to http://www.google.com
  3. YOUR PROXY SERVER forwards CLIENT's request to ANOTHER PROXY SERVER, chosen from a list of such servers (the PROXY POOL).
  4. YOUR PROXY SERVER receives the response from ANOTHER PROXY SERVER and returns it to CLIENT

So, I used Flask for this.
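As a rough illustration of the four steps above, here is a minimal sketch that uses only the Python standard library rather than Flask. It is not the answer's original code: the names PROXY_POOL, choose_proxy, fetch_via_proxy, PoolProxyHandler and run, as well as the upstream proxy addresses, are hypothetical placeholders. It assumes plain HTTP GET requests only; HTTPS through a proxy uses the CONNECT method, which this sketch does not handle.

```python
import random
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical upstream proxies; replace with real, working ones.
PROXY_POOL = [
    "http://10.0.0.1:3128",
    "http://10.0.0.2:3128",
]


def choose_proxy(pool):
    """Step 3: pick one upstream proxy from the pool (here: at random)."""
    return random.choice(pool)


def fetch_via_proxy(url, proxy, timeout=10):
    """Fetch `url` through `proxy`; return (status, content_type, body)."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    )
    with opener.open(url, timeout=timeout) as resp:
        content_type = resp.headers.get("Content-Type", "application/octet-stream")
        return resp.status, content_type, resp.read()


class PoolProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Step 2: a client that uses this server as an HTTP proxy sends the
        # absolute URL in the request line, so self.path is the target URL.
        url = self.path
        proxy = choose_proxy(PROXY_POOL)
        try:
            status, content_type, body = fetch_via_proxy(url, proxy)
        except Exception as exc:
            # Upstream errors (including HTTP 4xx/5xx, which urllib raises
            # as exceptions) are reported to the client as 502 Bad Gateway.
            self.send_error(502, explain=f"Upstream proxy {proxy} failed: {exc}")
            return
        # Step 4: relay the upstream response back to the client.
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(host="0.0.0.0", port=8080):
    """Step 1: listen for incoming client requests on a single address."""
    ThreadingHTTPServer((host, port), PoolProxyHandler).serve_forever()
```

Calling run() starts the server. A client could then point its 'http' proxy setting at this machine's address and port 8080, as in the requests snippet from the question.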

Then, you can start from here to improve your proxy server.

A typical proxy pool, or proxy manager, can check the availability, speed, and other statistics of its proxies, and select the best proxy for each request. Of course, this example handles only simple requests; you can add features to handle request arguments, methods, and protocols.
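One possible way to track availability and speed is sketched below. The ProxyStats and ProxyManager classes and the "highest success rate, then lowest average latency" rule are assumptions made for illustration; a real manager might weight recent results more heavily or run periodic health checks.

```python
from dataclasses import dataclass


@dataclass
class ProxyStats:
    """Running counters for one upstream proxy (hypothetical structure)."""
    successes: int = 0
    failures: int = 0
    total_seconds: float = 0.0

    @property
    def success_rate(self):
        total = self.successes + self.failures
        return self.successes / total if total else 0.0

    @property
    def avg_seconds(self):
        # Proxies with no successful request yet are treated as infinitely slow.
        return self.total_seconds / self.successes if self.successes else float("inf")


class ProxyManager:
    def __init__(self, proxies):
        self.stats = {proxy: ProxyStats() for proxy in proxies}

    def record(self, proxy, ok, seconds=0.0):
        """Record the outcome of one request sent through `proxy`."""
        stats = self.stats[proxy]
        if ok:
            stats.successes += 1
            stats.total_seconds += seconds
        else:
            stats.failures += 1

    def best(self):
        """Return the proxy with the highest success rate, then lowest latency."""
        return min(
            self.stats,
            key=lambda p: (-self.stats[p].success_rate, self.stats[p].avg_seconds),
        )
```

In a server, record() would be called after each forwarded request with the measured elapsed time, and best() would replace a random choice when picking the upstream proxy.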

Hope this helps!
