Is it possible to scale an Aurora RDS cluster up or down without downtime?
Question
I have an RDS Aurora PostgreSQL cluster with two instances:
cluster
├── instance_1 [writer] [no multiAZ]
└── instance_2 [reader] [no multiAZ]
When I change the instance type of instance_1, the failover operation works correctly, but I see about 1~2 minutes of downtime. I measured the downtime by running:
watch -n 3 "psql -h db.cluster.url -p 5432 -d postgres -U postgres -c 'select ID from TABLE limit 1'"
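Because `watch -n 3` only samples every three seconds, the measured downtime is approximate. A minimal Python sketch of a finer-grained probe follows. It assumes that a plain TCP connection to port 5432 is a good enough health signal, which is weaker than running the actual query, and the helper names `probe_tcp`, `collect`, and `longest_outage` are hypothetical.

```python
import socket
import time
from typing import Iterable, List, Tuple

Sample = Tuple[float, bool]  # (timestamp in seconds, probe succeeded?)


def probe_tcp(host: str, port: int = 5432, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def collect(host: str, port: int = 5432, interval: float = 1.0,
            duration: float = 300.0) -> List[Sample]:
    """Probe the endpoint every `interval` seconds for `duration` seconds."""
    samples: List[Sample] = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        samples.append((time.monotonic(), probe_tcp(host, port)))
        time.sleep(interval)
    return samples


def longest_outage(samples: Iterable[Sample]) -> float:
    """Upper bound on the longest outage: the time from the last successful
    probe before a run of failures to the first successful probe after it.
    Failure runs that are not bracketed by two successes are ignored."""
    longest = 0.0
    last_ok = None
    in_outage = False
    for ts, ok in samples:
        if ok:
            if in_outage and last_ok is not None:
                longest = max(longest, ts - last_ok)
            last_ok = ts
            in_outage = False
        else:
            in_outage = True
    return longest
```

Running `longest_outage(collect("db.cluster.url"))` while the instance type is being changed would give a tighter estimate than the three-second `watch` loop. A successful TCP connection does not prove the instance accepts writes, so the real outage may be somewhat longer.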
After that, instance_1 becomes the reader.
Is there any way to manually switch instance_1 to reader, change its type, and then revert it to writer without long downtime? No downtime would be best, but 5~10 seconds is also acceptable.
I know that I could use Multi-AZ instances, but that would cost twice as much.
Recommended answer
Using RDS Proxy can greatly reduce downtime during failover:
With RDS Proxy, failover times for Aurora and RDS databases are reduced by up to 66%
A large part of the seemingly long failover is taken up by:
- client libraries recovering from the lost connection
- DNS propagation of the reader/writer switch
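The first item, client-side recovery from a lost connection, can be sketched in Python as a retry loop with exponential backoff. This is an illustration under assumptions and is not part of the original answer. The helper `call_with_retry` is hypothetical, and `operation` is assumed to open a fresh connection on every call so that a retry does not reuse a dead socket.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    max_attempts: int = 6,
    base_delay: float = 0.2,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `max_attempts` is reached.

    Only exceptions listed in `retry_on` are retried. With a PostgreSQL
    driver such as psycopg2 the caller would likely pass its
    OperationalError here (an assumption, not shown in the original post).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # give up and surface the last connection error
            # Exponential backoff capped at max_delay, plus random jitter
            # so that many clients do not reconnect at the same instant.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("max_attempts must be at least 1")
```

Short initial delays matter here: if the failover itself finishes within a few seconds, a client that waits a long time before its first retry adds avoidable downtime on top of it.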
RDS Proxy handles the reader/writer switch itself, so no DNS changes have to be propagated to the client; the client always uses the same endpoint.
There is a good article on RDS Proxy showing that average failover recovery time drops from 24 seconds to 3 seconds when RDS Proxy is used.