4 node setup in Cassandra is the same as 3 node setup


Question



I have a 4 node setup in Cassandra and decided to go with the following configuration, but people are saying this will be the same as a 3 node setup, so could somebody please shed some light on why:

Nodes = 3, Replication Factor = 2, Write Consistency = 2, Read Consistency = 1
Nodes = 4, Replication Factor = 3, Write Consistency = 3, Read Consistency = 1

As per my understanding, with Nodes = 4, to provide for a two node failure it is beneficial to have RF = 3, but people are saying RF = 2 will be the same as RF = 3 in a 4 node setup. Could you please explain why?

Thanks, Harry

Solution

Your question is a little bit unclear, as I think you haven't constructed it properly. But I will try to explain some points that should help you understand it.

Nodes = 4, to provide for a two node failure it is beneficial to have RF as '3'

  1. The number of nodes is not the deciding factor for read/write failures. RF (Replication Factor) and CL (Consistency Level) are the deciding factors for whether reads/writes fail (when required replicas or nodes are down).

RF -> How many copies of the data (rows) will be kept, i.e. how many servers or nodes will hold the same row/data.

CL -> How many nodes must acknowledge the operation before the client is informed that the write/read was successful. That means at least as many nodes as the CL (e.g. if CL is 2, at least 2 nodes) have to acknowledge that they have written the data successfully, or the data has been read from that many replicas (the coordinator node waits until all the required replicas return their results), the results are merged (the latest data wins if different nodes hold different updates of the same row), and the result is returned to the client.
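To make the distinction concrete: RF is fixed when a keyspace is created, while CL is chosen per request. Below is a minimal sketch using the DataStax Python driver (cassandra-driver); the contact point and the keyspace/table names (demo, users) are assumptions for illustration, not part of the original answer.

from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

# Connect to the cluster (the contact point is an assumption for this sketch).
cluster = Cluster(['127.0.0.1'])
session = cluster.connect()

# RF is a property of the keyspace: here, 3 copies of every row.
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
""")
session.execute(
    "CREATE TABLE IF NOT EXISTS demo.users (id int PRIMARY KEY, name text)")

# CL is chosen per request. Write CL = 3 means all 3 replicas must acknowledge
# (with RF = 3 this is effectively ALL, as the note below explains).
write = SimpleStatement(
    "INSERT INTO demo.users (id, name) VALUES (1, 'harry')",
    consistency_level=ConsistencyLevel.THREE)
session.execute(write)

# Read CL = 1: a single replica answering is enough for the read to succeed.
read = SimpleStatement(
    "SELECT name FROM demo.users WHERE id = 1",
    consistency_level=ConsistencyLevel.ONE)
print(session.execute(read).one())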

Note: If CL = RF, then you have effectively used a CL of ALL.

ALL is the highest consistency level (the data is guaranteed to be up to date, but the operation becomes unavailable if even a single replica is down).

Scenario 1:

Nodes = 3, Replication Factor = 2, Write Consistency = 2, Read Consistency = 1

For Write operation:

As you have used the highest write CL (the RF and write CL are the same), this is a case of single point of failure: all required replicas have to be alive so the client can be acknowledged that the data has been written successfully to both nodes.

For Read operation:

The read CL is ONE, so a read can survive one replica being down, because only one replica needs to return a result to the client. It may be stale data (if an update has not yet propagated to that node; eventually it will be consistent), but the read will succeed.

Scenario 2:

Nodes = 3, Replication Factor = 3, Write Consistency = 2, Read Consistency = 1

For Write operation:

As the number of nodes = RF, all data will be copied to every node (each node owns 100% of the data). With a write CL of 2, the write will survive one node/replica being down.

For Read operation: It can survive even if two replicas are down (read CL is 1, so only one replica has to be up).

Scenario 3:

Nodes = 4, Replication Factor = 2, Write Consistency = 2, Read Consistency = 1

For Write operation:

Same as scenario 1.

For Read operation:

Same as scenario 1.

Scenario 4:

Nodes = 4, Replication Factor = 3, Write Consistency = 3, Read Consistency = 1

For Write operation:

Same as scenario 1.

For Read operation:

Same as scenario 2.
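The rule behind all four scenarios is the same: an operation succeeds as long as the number of live replicas is at least the CL used for that operation, so a request can tolerate RF - CL replicas being down. A small Python sketch (my illustration, not part of the original answer) that reproduces the four scenarios:

# Rule behind the four scenarios: a request succeeds while
# (live replicas) >= (consistency level of that request).
def tolerated_failures(rf, cl):
    """How many replicas may be down before the request starts failing."""
    return rf - cl

scenarios = [
    # (nodes, rf, write_cl, read_cl)
    (3, 2, 2, 1),  # Scenario 1
    (3, 3, 2, 1),  # Scenario 2
    (4, 2, 2, 1),  # Scenario 3
    (4, 3, 3, 1),  # Scenario 4
]

for n, (nodes, rf, wcl, rcl) in enumerate(scenarios, start=1):
    print(f"Scenario {n}: nodes={nodes}, RF={rf} -> "
          f"writes tolerate {tolerated_failures(rf, wcl)} replica(s) down, "
          f"reads tolerate {tolerated_failures(rf, rcl)} replica(s) down")

Note that the number of nodes never appears in the calculation, which is exactly the point of the answer.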

Related Link:

Understand cassandra replication factor versus consistency level

For details, see the DataStax documentation.

Edited

The number of nodes does not matter if you are concerned about node failure scenarios (read or write requests failing).

Assume you have 3/4/5 nodes: if RF is 3 and CL is QUORUM (floor(3/2) + 1 = 2), the cluster can tolerate 1 replica node being down. Please read the "About the QUORUM level" section in the link above.
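As a quick check of that arithmetic, here is a sketch under the DataStax definition quorum = floor(RF/2) + 1 (my illustration, not part of the original answer):

def quorum(rf):
    # DataStax definition: quorum = floor(RF / 2) + 1
    return rf // 2 + 1

for rf in (2, 3, 5):
    q = quorum(rf)
    print(f"RF={rf}: QUORUM={q}, tolerates {rf - q} replica(s) down")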

If you have more nodes, the cluster can handle more data and load, and distribute the data properly among the nodes. But the request failure scenarios will be the same:

Nodes = 3, Replication Factor = 3, Write Consistency = 2, Read Consistency = 1

Nodes = 4, Replication Factor = 3, Write Consistency = 2, Read Consistency = 1

Nodes = 5, Replication Factor = 3, Write Consistency = 2, Read Consistency = 1

As RF is 3 and the write and read CL are 2 and 1 respectively, the cluster can tolerate one replica being down for a write and two replicas being down for a read operation. I hope this helps you.
