Is it possible to read data only from a single node in a Cassandra cluster with a replication factor of 3?

Question

I know that Cassandra has different read consistency levels, but I haven't seen a consistency level that allows us to read data by key from only one node. I mean, if we have a cluster with a replication factor of 3, then we will always ask all nodes when we read. Even if we choose a consistency level of ONE, we will ask all nodes but wait for the first response from any node. That is why a read loads not just one node but 3 (4 with a coordinator node). I think we can't really improve read performance even if we set a bigger replication factor.

Is it possible to read really only from a single node?

Recommended answer

Are you using a Token-Aware Load Balancing Policy?

If you are, and you are querying with a consistency of LOCAL_ONE/ONE, a read query should only contact a single node.
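As a rough illustration, here is how that combination might be configured with the DataStax Python driver (`cassandra-driver`). This is only a sketch under assumptions: the contact point, the data center name `dc1`, and the keyspace name `space` are hypothetical placeholders, and the helper names `build_session` and `read_kerbalnaut` are mine. The query is prepared because, in the Python driver, a prepared statement lets the driver derive the routing key that the token-aware policy needs.

```python
def build_session(contact_points=("127.0.0.1",), local_dc="dc1", keyspace="space"):
    """Connect with token-aware routing and LOCAL_ONE reads (all arguments are placeholders)."""
    # Imported lazily so this sketch can be loaded without the driver installed.
    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
    from cassandra.policies import DCAwareRoundRobinPolicy, TokenAwarePolicy

    profile = ExecutionProfile(
        # Prefer a replica that owns the partition as the coordinator.
        load_balancing_policy=TokenAwarePolicy(DCAwareRoundRobinPolicy(local_dc=local_dc)),
        # Only one replica in the local data center has to answer.
        consistency_level=ConsistencyLevel.LOCAL_ONE,
    )
    cluster = Cluster(list(contact_points), execution_profiles={EXEC_PROFILE_DEFAULT: profile})
    return cluster.connect(keyspace)


def read_kerbalnaut(session, name):
    """Fetch one row by partition key using a prepared statement."""
    # In real code, prepare once and reuse the statement across calls.
    select = session.prepare("SELECT * FROM kerbalnauts WHERE name = ?")
    return session.execute(select, [name]).one()
```

Usage would look like `read_kerbalnaut(build_session(), "Bill")` against a reachable cluster.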

Give the article Ideology and Testing of a Resilient Driver a read. In it, you'll notice that using the TokenAwarePolicy has this effect:


"For cases with a single datacenter, the TokenAwarePolicy chooses the primary replica to be the chosen coordinator in hopes of cutting down latency by avoiding the typical coordinator-replica hop."

So here's what happens. Let's say that I have a table for keeping track of Kerbalnauts, and I want to get all data for "Bill." I would use a query like this:

SELECT * FROM kerbalnauts WHERE name='Bill';

The driver hashes my partition key value (name) to the token of 4639906948852899531 (SELECT token(name) FROM kerbalnauts WHERE name='Bill'; returns that value). If I am working with a 6-node cluster, then my primary token ranges will look like this:

node   start range              end range
1)     9223372036854775808 to  -9223372036854775808
2)    -9223372036854775807 to  -5534023222112865485
3)    -5534023222112865484 to  -1844674407370955162
4)    -1844674407370955161 to   1844674407370955161
5)     1844674407370955162 to   5534023222112865484
6)     5534023222112865485 to   9223372036854775807

As node 5 is responsible for the token range containing the partition key "Bill," my query will be sent to node 5. As I am reading at a consistency of LOCAL_ONE, there will be no need for another node to be contacted, and the result will be returned to the client...having only hit a single node.
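The lookup described above can be sketched in a few lines of Python. This is not how the driver is implemented; it just reproduces the idea using the inclusive end-of-range values from the table above. The names `RANGE_ENDS` and `primary_node` are made up for this illustration.

```python
from bisect import bisect_left

# Inclusive end of each node's primary token range, copied from the table above.
RANGE_ENDS = [
    -9223372036854775808,  # node 1
    -5534023222112865485,  # node 2
    -1844674407370955162,  # node 3
    1844674407370955161,   # node 4
    5534023222112865484,   # node 5
    9223372036854775807,   # node 6
]


def primary_node(token: int) -> int:
    """Return the 1-based number of the node whose primary range contains the token."""
    # bisect_left finds the first range end that is >= token.
    return bisect_left(RANGE_ENDS, token) + 1


print(primary_node(4639906948852899531))  # prints 5
```

The token for "Bill" falls between 1844674407370955162 and 5534023222112865484, so the lookup lands on node 5, matching the walkthrough.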

Note: token ranges were computed with:

python3 -c 'print([str(((2**64 // 5) * i) - 2**63) for i in range(6)])'
