How do I set up global load balancing using Digital Ocean DNS and Nginx?


UPDATE: See the answer I've provided below for the solution I eventually got set up on AWS.

I'm currently experimenting with methods to implement a global load-balancing layer for my app servers on Digital Ocean and there's a few pieces I've yet to put together.

The Goal

Offer highly-available service to my users by routing all connections to the closest 'cluster' of servers in SFO, NYC, LON, and eventually Singapore.

Additionally, I would eventually like to automate the maintenance of this by writing a daemon that can monitor, scale, and heal any of the servers on the system. Or I'll combine various services to achieve the same automation goals. First I need to figure out how to do it manually.

The Stack

  1. Ubuntu 14.04
  2. Nginx 1.4.6
  3. node.js
  4. MongoDB from Compose.io (formerly MongoHQ)

Global Domain Breakdown

Once I rig everything up, my domain would look something like this:

**GLOBAL**
global-balancing-1.myapp.com
global-balancing-2.myapp.com
global-balancing-3.myapp.com

**NYC**
nyc-load-balancing-1.myapp.com
nyc-load-balancing-2.myapp.com
nyc-load-balancing-3.myapp.com

nyc-app-1.myapp.com
nyc-app-2.myapp.com
nyc-app-3.myapp.com

nyc-api-1.myapp.com
nyc-api-2.myapp.com
nyc-api-3.myapp.com

**SFO**
sfo-load-balancing-1.myapp.com
sfo-load-balancing-2.myapp.com
sfo-load-balancing-3.myapp.com

sfo-app-1.myapp.com
sfo-app-2.myapp.com
sfo-app-3.myapp.com

sfo-api-1.myapp.com
sfo-api-2.myapp.com
sfo-api-3.myapp.com

**LON**
lon-load-balancing-1.myapp.com
lon-load-balancing-2.myapp.com
lon-load-balancing-3.myapp.com

lon-app-1.myapp.com
lon-app-2.myapp.com
lon-app-3.myapp.com

lon-api-1.myapp.com
lon-api-2.myapp.com
lon-api-3.myapp.com

And then if there's any strain on any given layer, in any given region, I can just spin up a new droplet to help out: nyc-app-4.myapp.com, lon-load-balancing-5.myapp.com, etc…

Current Working Methodology

  • A (minimum) trio of global-balancing servers receive all traffic. These servers are "DNS Round-Robin" balanced as illustrated in this (frankly confusing) article: How To Configure DNS Round-Robin Load Balancing.

  • Using the Nginx GeoIP Module and MaxMind GeoIP Data the origin of any given request is determined down to the $geoip_city_continent_code.

  • The global-balancing layer then routes the request to the least connected server on the load-balancing layer of the appropriate cluster: nyc-load-balancing-1, sfo-load-balancing-3, lon-load-balancing-2, etc.. This layer is also a (minimum) trio of droplets.

  • The regional load-balancing layer then routes the request to the least connected server in the app or api layer: nyc-app-2, sfo-api-1, lon-api-3, etc…
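The four-step methodology above maps naturally onto a single Nginx config on each global balancer. A minimal sketch, assuming the GeoIP module is compiled in and using the question's hostnames; the continent-to-cluster map, the GeoIP database path, and the omission of the SFO/Singapore upstreams for brevity are my assumptions:

```nginx
http {
    # Resolve the client's continent with the ngx_http_geoip module.
    geoip_city /usr/share/GeoIP/GeoLiteCity.dat;

    # Pick a regional cluster from the continent code (mapping assumed).
    map $geoip_city_continent_code $closest_cluster {
        default nyc;   # fall back to NYC for unknown origins
        NA      nyc;
        EU      lon;
        # AS    sin;   # once the Singapore cluster exists
    }

    upstream nyc {
        least_conn;    # route to the least-connected regional balancer
        server nyc-load-balancing-1.myapp.com;
        server nyc-load-balancing-2.myapp.com;
        server nyc-load-balancing-3.myapp.com;
    }

    upstream lon {
        least_conn;
        server lon-load-balancing-1.myapp.com;
        server lon-load-balancing-2.myapp.com;
        server lon-load-balancing-3.myapp.com;
    }

    server {
        listen 80;
        location / {
            # A variable in proxy_pass that names a defined upstream
            # group selects that group at request time.
            proxy_pass http://$closest_cluster;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }
}
```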

The details of the Nginx kung fu can be found in this tutorial: Villiage Idiot: Setting up Nginx with GSLB/Reverse Proxy on AWS. More general info about Nginx load-balancing is available here and here.

Questions

Where do I put the global-balancing servers?

It strikes me as odd that I would either put them all in one place or spread that layer out around the globe. Say, for instance, I put them all in NYC. Then someone from France hits my domain. The request would go from France, to NYC, and then be routed back to LON. Or if I put one of each in SFO, NYC, and LON then isn't it still possible that a user from Toronto (Parkdale, represent) could send a request that ends up going to LON only to be routed back to NYC?

Do subsequent requests get routed to the same IP?

As in, if a user from Toronto sends a request that the global-balancing layer determines should be going to NYC, does the next request from that origin go directly to NYC, or is it still luck of the draw whether it hits the nearest global-balancing server (NYC in this case)?
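For context, the round-robin entry point is just multiple A records on one name, and the record TTL governs how long a resolver caches whichever balancer it drew. A hedged zone-file sketch, with placeholder addresses from the documentation range:

```dns
; Hypothetical round-robin A records for the global tier.
; A low TTL means resolvers re-query often, so a client may land on a
; different global balancer between requests.
$TTL 300
myapp.com.  IN  A  203.0.113.10   ; global-balancing-1
myapp.com.  IN  A  203.0.113.20   ; global-balancing-2
myapp.com.  IN  A  203.0.113.30   ; global-balancing-3
```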

What about sessions?

I've configured Nginx to use the ip_hash; directive so it will direct the user to the same app or api endpoint (a node process, in my case) but how will global balancing affect this, if at all?
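The `ip_hash` pinning lives in the regional upstream block; a sketch using the question's hostnames:

```nginx
# Hypothetical regional upstream: ip_hash pins a client IP to one
# app node for the lifetime of that node.
upstream nyc_app {
    ip_hash;
    server nyc-app-1.myapp.com;
    server nyc-app-2.myapp.com;
    server nyc-app-3.myapp.com;
}
```

Since each regional balancer computes the hash independently from the client address, the global tier only changes which region answers; within a region, the same client IP should keep landing on the same node.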

Any DNS Examples?

I'm not exactly a DNS expert (I'm currently trying to figure out why my CNAME records aren't resolving) but I'm a quick study when provided with a solid example. Has anyone gone through this process before and can provide a sample of what the DNS records look like for a successful setup?

What about SSL/TLS?

Would I need a certificate for every server, or just for the three global-balancing servers since that's the only public-facing gateway?

If you read this whole thing then reward yourself with a cupcake. Thanks in advance for any help.

Solution

The Goal: Offer highly-available service to my users by routing all connections to the closest 'cluster' of servers in SFO, NYC, LON, and eventually Singapore.

The global-balancing layer then routes the request to the least connected server...

If I'm reading your configuration correctly, you're actually proxying from your global balancers to the balancers at each region. This does not meet your goal of routing users to the nearest region.

There are three ways that I know of to get what you're looking for:

  1. 30x Redirect
    Your global balancers receive the HTTP request and then redirect it to a server group in or near the region it thinks the request is coming from, based on IP address. This sounds like what you were trying to set up. This method has side effects for some applications, and also increases the time it takes for a user to get data since you're adding a ton of overhead. This only makes sense if the resources you're redirecting to are very large, and the local regional cluster will be able to serve much more efficiently.
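The redirect variant can be sketched in the same Nginx/GeoIP terms as the proxying setup; the map values and hostnames are assumptions:

```nginx
# Hypothetical 302-redirect variant: instead of proxying, the global
# balancer sends the client straight to a regional hostname, after
# which the global tier is out of the path.
map $geoip_city_continent_code $regional_host {
    default nyc-load-balancing-1.myapp.com;
    EU      lon-load-balancing-1.myapp.com;
}

server {
    listen 80;
    location / {
        return 302 $scheme://$regional_host$request_uri;
    }
}
```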

  2. Anycast (taking advantage of BGP routing)
    This is what the big players like Akamai use for their CDN. Basically, there are multiple servers out on the internet with the exact same routable IP address. Suppose I have servers in several regions, and they have the IP address of 192.0.2.1. If I'm in the US and try to connect to 192.0.2.1, and someone is in Europe that tries to connect to 192.0.2.1, it's likely that we'll be routed to the nearest server. This uses the internet's own routing to find the best path (based on network conditions) for the traffic. Unfortunately, you can't just use this method. You need your own AS number, and physical hardware. If you find a VPS provider that lets you have a chunk of their Anycast block, let me know!

  3. Geo-DNS
    There are some DNS providers that provide a service often marketed as "Geo-DNS". They have a bunch of DNS servers hosted on anycast addresses which can route traffic to your nearest servers. If a client queries a European DNS server, it should return the address for your European region servers, vs. some in other regions. There are many variations on Geo-DNS services: some work via anycast as just described, while others simply maintain a geo-IP database and return the server for the region they think is closer, just like the redirect method but applied at the DNS level, before the HTTP request is ever made. This is usually the best option for price and ease of use.
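One concrete flavor of Geo-DNS (and, per the asker's update, the route eventually taken) is AWS Route 53 geolocation routing. A hedged sketch of a `change-resource-record-sets` payload; the set identifier and address are placeholders:

```json
{
  "Comment": "Send European clients to the LON balancers",
  "Changes": [{
    "Action": "CREATE",
    "ResourceRecordSet": {
      "Name": "myapp.com",
      "Type": "A",
      "SetIdentifier": "europe",
      "GeoLocation": { "ContinentCode": "EU" },
      "TTL": 60,
      "ResourceRecords": [{ "Value": "203.0.113.20" }]
    }
  }]
}
```

A matching record set with a default `GeoLocation` (or one per continent) covers clients that don't match any rule.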

Do subsequent requests get routed to the same IP?

Many load balancers have a "stickiness" option that says requests from the same network address should be routed to the same end server (provided that end server is still up and running).

What about sessions?

This is exactly why you would want that stickiness. When it comes to session data, you are going to have to find a way to keep all your servers up-to-date. Realistically, this isn't always guaranteed. How you handle it depends on your application. Can you keep a Redis instance or whatever out there for all your servers to reliably hit from around the world? Do you really need that session data in every region? Or can you have your main application servers dealing with session data in one location?

Any DNS Examples?

Post separate questions for these. Everyone's "successful setup" looks different.

What about SSL/TLS?

If you're proxying data, only your global balancers need to handle HTTPS. If you're redirecting, then all the servers need to handle it.
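In the proxying case, TLS termination at the global tier looks roughly like this; certificate paths and the upstream name are assumptions:

```nginx
# Hypothetical TLS termination at a global balancer (proxying mode):
# clients speak HTTPS to this gateway only, and traffic to the
# regional tier can stay plain HTTP on the private network.
server {
    listen 443 ssl;
    server_name myapp.com;

    ssl_certificate     /etc/nginx/ssl/myapp.com.crt;
    ssl_certificate_key /etc/nginx/ssl/myapp.com.key;

    location / {
        proxy_pass http://nyc;   # regional upstream defined elsewhere
        proxy_set_header Host $host;
    }
}
```

In the redirect case this block would have to exist (with a valid certificate) on every regional balancer, since clients connect to them directly.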
