How does Cassandra select the node to send a request to?

Question

Imagine a Cassandra cluster that needs to be accessed by a client application. In the Java API, we create a Cluster instance and send read or write requests via a Session. If we use read/write consistency ONE, how does the API select the actual node (the coordinator node) to forward the request to? Is it a random selection? Please help me figure this out.

Recommended answer

Cassandra drivers use a process called node discovery to gain information about the cluster. The nodes share that information among themselves over the "gossip" protocol; the driver does not gossip itself, but reads the resulting cluster metadata (the system.local and system.peers tables) from a node it connects to. If a node becomes unavailable, the client driver automatically tries other nodes and schedules reconnection attempts with the dead one(s). According to the DataStax docs:

Gossip is a peer-to-peer communication protocol in which nodes periodically exchange state information about themselves and about other nodes they know about. The gossip process runs every second and exchanges state messages with up to three other nodes in the cluster. The nodes exchange information about themselves and about the other nodes that they have gossiped about, so all nodes quickly learn about all other nodes in the cluster.

Essentially, the list of nodes that you give your client to connect to serves as the initial contact points for gaining information on the entire cluster. This is why your client can communicate with all nodes in the cluster (if need be) even though you may only provide a small subset of nodes in your connection string.
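To make the role of contact points concrete, here is a minimal sketch in plain Java. It is a toy model, not the driver's API: the class DiscoverySketch, the method discover, and the peersByNode map (a stand-in for the peer list a reachable node would report) are all hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of node discovery; NOT the real driver API.
final class DiscoverySketch {

    // peersByNode is a hypothetical stand-in for the peer list that each
    // reachable node reports. A node missing from the map is treated as down.
    static Set<String> discover(List<String> contactPoints,
                                Map<String, Set<String>> peersByNode) {
        for (String contact : contactPoints) {
            Set<String> peers = peersByNode.get(contact);
            if (peers == null) {
                continue; // contact point unreachable, try the next one
            }
            // The first reachable contact point reveals the rest of the cluster.
            Set<String> known = new TreeSet<>(peers);
            known.add(contact);
            return known;
        }
        throw new IllegalStateException("no contact point reachable");
    }

    public static void main(String[] args) {
        Map<String, Set<String>> peers = new HashMap<>();
        // 10.0.0.1 is down (no entry); 10.0.0.2 knows about two other nodes.
        peers.put("10.0.0.2", new HashSet<>(Arrays.asList("10.0.0.1", "10.0.0.3")));
        System.out.println(discover(Arrays.asList("10.0.0.1", "10.0.0.2"), peers));
    }
}
```

In this sketch a single reachable contact point is enough to learn about every node, which is why a small subset of addresses in the connection string suffices.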

Once your driver has the information on the cluster, it can make intelligent decisions about which node to run a query on. Node selection is not a process of voting or random selection. Based on the cluster information returned, the client driver applies its Load Balancing Policy. While the policy takes several factors into consideration, it basically tries to pick a node with the lowest network "distance" from the client.
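As a rough illustration of what such a policy does, the sketch below builds an ordered list of candidate coordinators for each request: live nodes at the lowest distance first, rotated round-robin so load spreads out, then live remote nodes as fallbacks. It is loosely modeled on round-robin style policies and is an assumption-laden toy, not the driver's implementation; RoundRobinSketch, Node, Distance and queryPlan are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Toy load-balancing policy; NOT the real driver implementation.
final class RoundRobinSketch {

    enum Distance { LOCAL, REMOTE }

    static final class Node {
        final String address;
        final Distance distance;
        boolean up = true; // flipped to false when the node is considered dead

        Node(String address, Distance distance) {
            this.address = address;
            this.distance = distance;
        }
    }

    private final List<Node> nodes;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinSketch(List<Node> nodes) {
        this.nodes = nodes;
    }

    // Ordered list of coordinators to try for one request.
    List<Node> queryPlan() {
        List<Node> local = new ArrayList<>();
        List<Node> remote = new ArrayList<>();
        for (Node n : nodes) {
            if (!n.up) {
                continue; // dead nodes are skipped entirely
            }
            (n.distance == Distance.LOCAL ? local : remote).add(n);
        }
        if (!local.isEmpty()) {
            // Rotate so each call starts from the next local node.
            int shift = Math.floorMod(counter.getAndIncrement(), local.size());
            Collections.rotate(local, -shift);
        }
        local.addAll(remote); // remote nodes are tried only after local ones
        return local;
    }

    public static void main(String[] args) {
        List<Node> cluster = new ArrayList<>();
        cluster.add(new Node("10.0.0.1", Distance.LOCAL));
        cluster.add(new Node("10.0.0.2", Distance.LOCAL));
        cluster.add(new Node("10.1.0.1", Distance.REMOTE));
        RoundRobinSketch policy = new RoundRobinSketch(cluster);
        for (int i = 0; i < 3; i++) {
            System.out.println(policy.queryPlan().get(0).address);
        }
    }
}
```

The first entry of each plan is the coordinator the request goes to; the remaining entries are what the client would fall back on if that node turned out to be unavailable.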

Edit 20200322

Let me expand a bit on the point about the Load Balancing Policy. I encourage developers of high-performance applications to use the TokenAwarePolicy. This policy hashes the partition key values to a "token," and uses that token to determine which node(s) are responsible for the token range it falls in. The query is then sent directly to a node that contains the requested data, and that replica acts as the coordinator, which skips the extra network hop through an intermediate "coordinator" node.
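The token lookup described above can be sketched with a sorted map. This is a simplified model under stated assumptions: one token per node, no replication, and String.hashCode() as a stand-in hash (Cassandra's default partitioner uses Murmur3). TokenRingSketch, addNode, token and ownerOf are hypothetical names, not driver API.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy token ring; NOT the real driver or partitioner implementation.
final class TokenRingSketch {

    // Maps each node's token to the node address, sorted by token.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(String node, long token) {
        ring.put(token, node);
    }

    // Stand-in hash for illustration only (assumption: real clusters
    // typically hash partition keys with Murmur3).
    static long token(String partitionKey) {
        return partitionKey.hashCode();
    }

    // A node with token T owns the range (previous token, T], so the owner
    // is the first node whose token is >= the key's token, wrapping around
    // to the lowest token when the key's token is past the highest one.
    String ownerOf(long token) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("empty ring");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(token);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        TokenRingSketch ring = new TokenRingSketch();
        ring.addNode("10.0.0.1", -1_000_000_000L);
        ring.addNode("10.0.0.2", 0L);
        ring.addNode("10.0.0.3", 1_000_000_000L);
        String key = "user:42";
        System.out.println(key + " -> " + ring.ownerOf(token(key)));
    }
}
```

A token-aware policy performs this kind of lookup on the client side, so the request can be addressed to a node that owns the data instead of an arbitrary one.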

However, if you are using a non-token-aware load balancing policy, or running a query that does not filter on a partition key, then the original process described above applies.
