Cross-colo fail-over design, DNS level fail-over?

Question

I'm interested in cross-colo fail-over strategies for web applications, such that if the main site fails, users seamlessly land at the fail-over site in another colo.

The application side looks mostly figured out: a master-slave database setup between the colos, and services designed to recover and pick up mid-stream. What I'm trying to work out is the strategy for moving traffic from the main site to the fail-over site. DNS failover, even with low TTLs, seems to carry a fair bit of latency.

What strategies would you recommend for quickly moving traffic between colos, assuming the servers at the main colo are unreachable?

If you have other interesting experience or words of wisdom about cross-colo failover, I'd love to hear those as well.

Recommended answer

DNS-based mechanisms are troublesome, even if you put low TTLs in your zone files.

The reason is that many applications (e.g. MSIE) maintain their own caches, which ignore the TTL. Other software will make a single gethostbyname() or equivalent call and keep the result until the program is restarted.
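For software you control, the "resolve once and keep it forever" problem can be avoided on the client side. The following is a minimal sketch, not part of the original answer: the class name ReResolvingEndpoint and the 30-second max_age are hypothetical choices. It uses Python's socket.getaddrinfo() and simply repeats the lookup once the cached answer is older than an application-chosen age.

```python
import socket
import time


class ReResolvingEndpoint:
    """Looks a hostname up again once the cached answer is older than max_age.

    max_age is an application-chosen bound (the default is an assumption);
    it is not the DNS TTL, which socket.getaddrinfo() does not expose.
    """

    def __init__(self, host, port, max_age=30.0,
                 resolver=socket.getaddrinfo, clock=time.monotonic):
        self.host = host
        self.port = port
        self.max_age = max_age
        self._resolver = resolver   # injectable so the logic can be tested offline
        self._clock = clock
        self._addresses = None
        self._fetched_at = None

    def addresses(self):
        """Return the current list of IP addresses, re-resolving if stale."""
        now = self._clock()
        stale = (self._fetched_at is None
                 or now - self._fetched_at >= self.max_age)
        if stale:
            infos = self._resolver(self.host, self.port,
                                   type=socket.SOCK_STREAM)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IP address for both IPv4 and IPv6.
            self._addresses = [info[4][0] for info in infos]
            self._fetched_at = now
        return self._addresses
```

A long-running client would call addresses() before each new connection rather than storing the first result. This only helps clients you write yourself; it does nothing for browsers or third-party software with their own caches.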

Worse still, many ISPs' recursive DNS servers are known to ignore TTLs below their own preferred minimum and impose their own higher TTLs.
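The effect of such a resolver-side floor on failover time can be put into rough numbers. This is a back-of-the-envelope sketch only: the function names are hypothetical, and the figures in the usage note (60-second zone TTL, 300-second resolver floor, 30 seconds to detect the outage and update the record) are made-up illustrations, not measurements.

```python
def effective_ttl(zone_ttl, resolver_min_ttl):
    """TTL actually applied by a resolver that raises short TTLs to its own floor."""
    return max(zone_ttl, resolver_min_ttl)


def switch_delay_bound(detect_seconds, zone_ttl, resolver_min_ttl):
    """Rough upper bound, in seconds, before a client behind that resolver
    sees the new address: time to detect the outage and change the record,
    plus one full cached lifetime at the resolver.

    Assumes the client itself honours what the resolver returns; the
    application-level caches described above can make it longer.
    """
    return detect_seconds + effective_ttl(zone_ttl, resolver_min_ttl)
```

With the illustrative numbers, switch_delay_bound(30, 60, 300) gives 330 seconds, against 90 seconds if the resolver honoured the 60-second TTL.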

Ultimately, if the site is to run from both data centers without changing its IP address, you need to look at "multihoming" via global BGP4 route announcements.

With multihoming you need at least a /24 netblock of "provider independent" (aka "PI") IP address space, which is then announced to the global routing table from the backup site only if the main site goes offline.
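The "announce from the backup site only when the main site is down" rule can be sketched as a small state machine. Everything here is an assumption for illustration: the class name BackupAnnouncer, the thresholds, and the 203.0.113.0/24 documentation prefix standing in for a real PI netblock. The sketch only decides when to announce or withdraw; running the health checks and passing the decision to an actual BGP speaker are deliberately left out.

```python
class BackupAnnouncer:
    """Decides when the backup site should announce or withdraw the prefix.

    The prefix is announced after fail_threshold consecutive failed health
    checks of the main site, and withdrawn after recover_threshold
    consecutive successes, so a single lost probe does not flap the route.
    """

    def __init__(self, prefix, fail_threshold=3, recover_threshold=5):
        self.prefix = prefix
        self.fail_threshold = fail_threshold
        self.recover_threshold = recover_threshold
        self.announcing = False
        self._fails = 0
        self._oks = 0

    def observe(self, main_site_healthy):
        """Feed one health-check result.

        Returns ("announce", prefix) or ("withdraw", prefix) when the state
        changes, otherwise None.
        """
        if main_site_healthy:
            self._oks += 1
            self._fails = 0
            if self.announcing and self._oks >= self.recover_threshold:
                self.announcing = False
                return ("withdraw", self.prefix)
        else:
            self._fails += 1
            self._oks = 0
            if not self.announcing and self._fails >= self.fail_threshold:
                self.announcing = True
                return ("announce", self.prefix)
        return None
```

For example, with fail_threshold=3, three consecutive observe(False) calls yield ("announce", "203.0.113.0/24") on the third call. The recovery threshold is a design choice: requiring several good checks before withdrawing avoids bouncing traffic back to a main site that is only intermittently reachable.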
