Configuring Apache Cassandra for Disaster Recovery

Question

How do you configure Apache Cassandra for disaster recovery, so that the cluster keeps working when one of two data-centres fails?

The DataStax documentation talks about using a replication strategy that ensures at least one replica is written to each of your two data-centres. But I don't see how that helps once the disaster has actually happened. If you switch to the remaining data-centre, all your writes will fail, because those writes cannot be replicated to the other data-centre.

I guess you would want your software to operate in two modes: a normal mode, in which writes must replicate across both data-centres, and a disaster mode, in which they need not. But changing the replication strategy does not seem possible.

What I really want is two over-provisioned data-centres that share the load during normal operation, and that fall back to the resources of the one remaining data-centre (with reduced performance) when only one data-centre is functioning.

Answer

The trick is to vary the consistency level given through the API for writes, instead of varying the replication factor. Use LOCAL_QUORUM for writes during a disaster, when only one data-centre is available. During normal operation use EACH_QUORUM, which requires a quorum of replicas in each data-centre to acknowledge the write, so that both data-centres have a copy of the data. Reads can use LOCAL_QUORUM all the time.
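As a minimal sketch of that idea, the consistency level can be chosen per request from the current operating mode. The `Mode` enum and the two helper functions are hypothetical names introduced for illustration; the returned strings are the Cassandra consistency level names, which a client driver would map to its own constants (for example `ConsistencyLevel.EACH_QUORUM` in the DataStax Python driver).

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical operating modes of the application."""
    NORMAL = "normal"      # both data-centres are reachable
    DISASTER = "disaster"  # only the local data-centre is reachable


def write_consistency(mode: Mode) -> str:
    """Pick the consistency level for writes.

    Normal mode waits for a quorum in every data-centre; disaster mode
    only waits for a quorum in the local data-centre.
    """
    return "EACH_QUORUM" if mode is Mode.NORMAL else "LOCAL_QUORUM"


def read_consistency(mode: Mode) -> str:
    """Reads stay local in both modes."""
    return "LOCAL_QUORUM"
```

The replication settings of the keyspace stay untouched; only the per-request setting changes when the application switches mode.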

Here is a summary of the DataStax documentation for multiple data centers and of the older but still conceptually relevant disaster recovery documentation (0.7).

Make a recipe to suit your needs with the two consistency levels LOCAL_QUORUM and EACH_QUORUM.

Here, "local" means local to a single data center, while "each" means consistency is strictly maintained at the same level in each data center.

Suppose you have two data centers, one of them used strictly for disaster recovery. Then you could set the replication factor to...

3 for the primary write/read data center, and 2 for the failover data center.
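To make those numbers concrete, here is a small sketch that builds the corresponding CQL statement and computes the quorum size per data center. The keyspace name `app` and the data center names `DC1` and `DC2` are assumptions for illustration; the data center names must match whatever your snitch reports.

```python
def quorum(replication_factor: int) -> int:
    """Quorum size for one data center: a majority of its replicas."""
    return replication_factor // 2 + 1


def create_keyspace_cql(keyspace: str, replicas_per_dc: dict) -> str:
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy."""
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items())
    return (
        f"CREATE KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# Hypothetical layout: 3 replicas in the primary DC, 2 in the failover DC.
REPLICAS = {"DC1": 3, "DC2": 2}

print(create_keyspace_cql("app", REPLICAS))
for dc, rf in REPLICAS.items():
    print(f"{dc}: replication factor {rf}, quorum {quorum(rf)}")
```

Note that with a replication factor of 2 the quorum is also 2, so in this layout a quorum in the failover data center needs both of its replicas to be up.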

Now, depending on how critical it is that your data is actually written to the disaster recovery nodes, you can use either EACH_QUORUM or LOCAL_QUORUM. Assuming you are using the NetworkTopologyStrategy (NTS) replica placement strategy:

LOCAL_QUORUM on writes delays the client only until a quorum of replicas in the local data center (DC1) has acknowledged the write; the write to your recovery node(s) in DC2 happens asynchronously.

EACH_QUORUM ensures that all data is replicated to both data centers, but delays each write until both DCs confirm that the operation succeeded.

For reads it is likely best to just use LOCAL_QUORUM, to avoid inter-data-center latency.

There are catches to this approach! If you choose to use EACH_QUORUM on your writes, you increase the number of potential failure points: DC2 is down, the DC1-DC2 link is down, or a quorum cannot be met in DC1.
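The failure points listed above can be illustrated with a simplified model, which is only a sketch and not how Cassandra itself is implemented: a write succeeds if enough replicas are reachable from the coordinator. The function names and the `DC1`/`DC2` layout are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Majority of the replicas in one data center."""
    return replication_factor // 2 + 1


def write_succeeds(level: str, replicas: dict, reachable: dict, local_dc: str) -> bool:
    """Simplified check of whether a write can meet its consistency level.

    replicas:  replication factor per data center
    reachable: number of replicas per data center the coordinator can reach
    """
    if level == "LOCAL_QUORUM":
        return reachable.get(local_dc, 0) >= quorum(replicas[local_dc])
    if level == "EACH_QUORUM":
        return all(reachable.get(dc, 0) >= quorum(rf) for dc, rf in replicas.items())
    raise ValueError(f"unsupported consistency level: {level}")


REPLICAS = {"DC1": 3, "DC2": 2}

# DC2 down, or the DC1-DC2 link down: no DC2 replica is reachable from DC1.
dc2_unreachable = {"DC1": 3, "DC2": 0}
print(write_succeeds("EACH_QUORUM", REPLICAS, dc2_unreachable, "DC1"))   # False
print(write_succeeds("LOCAL_QUORUM", REPLICAS, dc2_unreachable, "DC1"))  # True

# DC1 quorum cannot be met: only one of three DC1 replicas is up.
dc1_degraded = {"DC1": 1, "DC2": 2}
print(write_succeeds("EACH_QUORUM", REPLICAS, dc1_degraded, "DC1"))      # False
```

In this model every one of the three failure cases blocks an EACH_QUORUM write, while a LOCAL_QUORUM write depends only on the local data center.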

The bonus is that once your DC1 goes down, you have a valid DC2 for disaster recovery. Also note that the second link (the disaster recovery documentation) talks about custom snitch settings for routing your IPs properly.
