How to setup global load balancing using Digital Ocean DNS and Nginx?


Question


UPDATE: See the answer I've provided below for the solution I eventually got set up on AWS.

I'm currently experimenting with methodologies on how to best implement a global load-balancing layer for my app servers on Digital Ocean and there's a few pieces I've yet to put together.

The Goal

Offer highly-available service to my users by routing all connections to the closest 'cluster' of servers in SFO, NYC, LON, and eventually Singapore.

Additionally, I would eventually like to automate the maintenance of this by writing a daemon that can monitor, scale, and heal any of the servers on the system. Or I'll combine various services to achieve the same automation goals. First I need to figure out how to do it manually.

The Stack

  1. Ubuntu 14.04
  2. Nginx 1.4.6
  3. node.js
  4. MongoDB from Compose.io (formerly MongoHQ)

Global Domain Breakdown

Once I rig everything up, my domain would look something like this:

**GLOBAL**
global-balancing-1.myapp.com
global-balancing-2.myapp.com
global-balancing-3.myapp.com

**NYC**
nyc-load-balancing-1.myapp.com
nyc-load-balancing-2.myapp.com
nyc-load-balancing-3.myapp.com

nyc-app-1.myapp.com
nyc-app-2.myapp.com
nyc-app-3.myapp.com

nyc-api-1.myapp.com
nyc-api-2.myapp.com
nyc-api-3.myapp.com

**SFO**
sfo-load-balancing-1.myapp.com
sfo-load-balancing-2.myapp.com
sfo-load-balancing-3.myapp.com

sfo-app-1.myapp.com
sfo-app-2.myapp.com
sfo-app-3.myapp.com

sfo-api-1.myapp.com
sfo-api-2.myapp.com
sfo-api-3.myapp.com

**LON**
lon-load-balancing-1.myapp.com
lon-load-balancing-2.myapp.com
lon-load-balancing-3.myapp.com

lon-app-1.myapp.com
lon-app-2.myapp.com
lon-app-3.myapp.com

lon-api-1.myapp.com
lon-api-2.myapp.com
lon-api-3.myapp.com

And then if there's any strain on any given layer, in any given region, I can just spin up a new droplet to help out: nyc-app-4.myapp.com, lon-load-balancing-5.myapp.com, etc…

Current Working Methodology

  • A (minimum) trio of global-balancing servers receive all traffic. These servers are DNS Round-Robin balanced as illustrated in this (frankly confusing) article: How To Configure DNS Round-Robin Load Balancing.

  • Using the Nginx GeoIP Module and MaxMind GeoIP Data the origin of any given request is determined down to the $geoip_city_continent_code.

  • The global-balancing layer then routes the request to the least connected server on the load-balancing layer of the appropriate cluster: nyc-load-balancing-1, sfo-load-balancing-3, lon-load-balancing-2, etc. This layer is also a (minimum) trio of droplets (a rough config sketch of this layer follows below).

  • The regional load-balancing layer then routes the request to the least connected server in the app or api layer: nyc-app-2, sfo-api-1, lon-api-3, etc…

The details of the Nginx kung-fu can be found in this tutorial: Villiage Idiot: Setting up Nginx with GSLB/Reverse Proxy on AWS. More general info about Nginx load-balancing is available here and here.
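
A minimal, illustrative sketch of what the global-balancing layer described above might look like in Nginx follows. This is a sketch, not verified config: it assumes Nginx was built with the GeoIP module, that the MaxMind database sits at the path shown, and that the snippet is included from the http context. The upstream hostnames follow the naming scheme above, and the continent-to-cluster mapping is an assumption (continent code alone can't split North American traffic between NYC and SFO; that would need region or city data).

# Illustrative sketch only; paths, mappings, and hostnames are assumptions.
# Assumed to be included from the http context, e.g. /etc/nginx/conf.d/global-balancer.conf.

# MaxMind city database (adjust the path to your installation).
geoip_city /usr/share/GeoIP/GeoLiteCity.dat;

# Map the request's continent code to a regional load-balancing upstream.
map $geoip_city_continent_code $closest_cluster {
    default nyc_balancers;   # fallback
    NA      nyc_balancers;   # North America
    EU      lon_balancers;   # Europe
    # AS    sin_balancers;   # Singapore, once that region exists
}

upstream nyc_balancers {
    least_conn;
    server nyc-load-balancing-1.myapp.com;
    server nyc-load-balancing-2.myapp.com;
    server nyc-load-balancing-3.myapp.com;
}

upstream lon_balancers {
    least_conn;
    server lon-load-balancing-1.myapp.com;
    server lon-load-balancing-2.myapp.com;
    server lon-load-balancing-3.myapp.com;
}

server {
    listen 80;
    server_name myapp.com;

    location / {
        # When proxy_pass uses a variable, Nginx matches it against the
        # named upstream groups defined above.
        proxy_pass http://$closest_cluster;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}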

Questions

Where do I put the global-balancing servers?

It strikes me as odd that I would either put them all in one place, or spread that layer out around the globe. Say, for instance, I put them all in NYC. Then someone from France hits my domain. The request would go from France, to NYC, and then be routed back to LON. Or if I put one of each in SFO, NYC, and LON then isn't it still possible that a user from Toronto (Parkdale, represent) could send a request that ends up going to LON only to be routed back to NYC?

Do subsequent requests get routed to the same IP?

As in, if a user from Toronto sends a request that the global-balancing layer determines should be going to NYC, does the next request from that origin go directly to NYC, or is it still luck of the draw that it will hit the nearest global-balancing server (NYC in this case)?

What about sessions?

I've configured Nginx to use the ip_hash; directive so it will direct the user to the same app or api endpoint (a node process, in my case) but how will global balancing affect this, if at all?
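
For reference, a minimal sketch of the kind of regional-balancer config the ip_hash; directive implies is below; the hostnames follow the breakdown above and nothing here is verified against the actual setup. One way global balancing does affect this: because the global layer proxies the request, the connecting address seen at this layer is the global balancer's, so ip_hash would hash on that unless the original client address is restored (for example with Nginx's realip module, as sketched).

# Illustrative sketch of a regional balancer, e.g. nyc-load-balancing-1.
# Assumed to be included from the http context.

# Restore the client address forwarded by the global layer so that
# ip_hash keys on the end user rather than on the global balancer.
set_real_ip_from 10.0.0.0/8;   # assumed private range of the global layer
real_ip_header   X-Real-IP;

upstream nyc_app {
    ip_hash;                   # pin a given client to the same node process
    server nyc-app-1.myapp.com;
    server nyc-app-2.myapp.com;
    server nyc-app-3.myapp.com;
}

server {
    listen 80;
    server_name nyc-load-balancing-1.myapp.com;

    location / {
        proxy_pass http://nyc_app;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}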

Any DNS Examples?

I'm not exactly a DNS expert (I'm currently trying to figure out why my CNAME records aren't resolving) but I'm a quick study when provided with a solid example. Has anyone gone through this process before and can provide a sample of what the DNS records look like for a successful setup?

What about SSL/TLS?

Would I need a certificate for every server, or just for the three global-balancing servers since that's the only public-facing gateway?

If you read this whole thing then reward yourself with a cupcake. Thanks in advance for any help.

Solution

The Goal: Offer highly-available service to my users by routing all connections to the closest 'cluster' of servers in SFO, NYC, LON, and eventually Singapore.

The global-balancing layer then routes the request to the least

If I'm reading your configuration correctly, you're actually proxying from your global balancers to the balancers at each region. This does not meet your goal of routing users to the nearest region.

There are three ways that I know of to get what you're looking for:

  1. 30x Redirect
    Your global balancers receive the HTTP request and then redirect it to a server group in or near the region it thinks the request is coming from, based on IP address. This sounds like what you were trying to set up (a sketch of this approach follows this list). This method has side effects for some applications, and also increases the time it takes for a user to get data since you're adding a ton of overhead. This only makes sense if the resources you're redirecting to are very large, and the local regional cluster will be able to serve much more efficiently.

  2. Anycast (taking advantage of BGP routing)
    This is what the big players like Akamai use for their CDN. Basically, there are multiple servers out on the internet with the exact same routable IP address. Suppose I have servers in several regions, and they have the IP address of 192.0.2.1. If I'm in the US and try to connect to 192.0.2.1, and someone in Europe tries to connect to 192.0.2.1, it's likely that we'll each be routed to the nearest server. This uses the internet's own routing to find the best path (based on network conditions) for the traffic. Unfortunately, you can't just use this method. You need your own AS number, and physical hardware. If you find a VPS provider that lets you have a chunk of their Anycast block, let me know!

  3. Geo-DNS
    There are some DNS providers that provide a service often marketed as "Geo-DNS". They have a bunch of DNS servers hosted on anycast addresses which can route traffic to your nearest servers. If a client queries a European DNS server, it should return the address for your European region servers, vs. servers in other regions. There are many variations on Geo-DNS services: some rely on an anycast DNS network like this, while others simply maintain a geo-IP database and return the server for the region they think is closest, just like the redirect method but decided at the DNS lookup, before the HTTP request is ever made. This is usually the best option, given the price and ease of use.
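
As an illustration of option 1, a redirect-based global balancer could look roughly like the sketch below. It reuses the GeoIP setup from the question ($geoip_city_continent_code); the continent-to-hostname mapping is an assumption, and in practice you would likely redirect to a regional hostname that resolves to that region's balancers rather than to a single droplet.

# Illustrative sketch of the 30x redirect approach; mappings are assumptions.
geoip_city /usr/share/GeoIP/GeoLiteCity.dat;   # same MaxMind data as before

map $geoip_city_continent_code $regional_host {
    default nyc-load-balancing-1.myapp.com;
    NA      nyc-load-balancing-1.myapp.com;
    EU      lon-load-balancing-1.myapp.com;
}

server {
    listen 80;
    server_name myapp.com;

    location / {
        # Send the client to its regional entry point; subsequent requests
        # go to that region directly without passing through this balancer.
        return 302 $scheme://$regional_host$request_uri;
    }
}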

Do subsequent requests get routed to the same IP?

Many load balancers have a "stickiness" option that says requests from the same network address should be routed to the same end server (provided that end server is still up and running).

What about sessions?

This is exactly why you would want that stickiness. When it comes to session data, you are going to have to find a way to keep all your servers up-to-date. Realistically, this isn't always guaranteed. How you handle it depends on your application. Can you keep a Redis instance or whatever out there for all your servers to reliably hit from around the world? Do you really need that session data in every region? Or can you have your main application servers dealing with session data in one location?

Any DNS Examples?

Post separate questions for these. Everyone's "successful setup" looks different.

What about SSL/TLS?

If you're proxying data, only your global balancers need to handle HTTPS. If you're redirecting, then all the servers need to handle it.
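
To illustrate the proxying case, here is a hedged sketch of TLS termination at the global-balancing layer; the certificate paths are placeholders and $closest_cluster refers to the geoip map from the earlier sketch. If you later switch to redirects or Geo-DNS, the regional balancers become public-facing and would each need a certificate (or a wildcard certificate covering *.myapp.com).

# Illustrative sketch: TLS terminates at the global balancer only.
server {
    listen 443 ssl;
    server_name myapp.com;

    ssl_certificate     /etc/nginx/ssl/myapp.com.crt;   # placeholder paths
    ssl_certificate_key /etc/nginx/ssl/myapp.com.key;

    location / {
        # $closest_cluster comes from the geoip map shown earlier; traffic to
        # the regional balancers stays plain HTTP inside this sketch.
        proxy_pass http://$closest_cluster;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-Proto https;
    }
}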
