Leader election for Paxos-based replicated key-value store


Question

I am going to implement a key-value store with Multi-Paxos. I would have several nodes, one of which is the primary node. This primary node receives update requests and replicates values to the slave nodes.

My question is how the primary node (or leader) is selected. Can I still use the Paxos algorithm? If so, do you think it is necessary to abstract the Paxos implementation into a single unit that can be used not only by the replication unit but also by the leader-election unit?

What if I use the node with the smallest id as the leader? And how can I implement the master lease?
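(For context, a master lease, as used in systems like Chubby, is typically a time-bounded grant that the current master renews before it expires, while other nodes refuse to start a new election until the old lease has safely expired. A minimal sketch of that bookkeeping, with hypothetical names and assuming a known bound on clock drift between nodes, might look like this:

```go
package lease

import (
	"sync"
	"time"
)

// MasterLease tracks a time-bounded claim to mastership.
// Safety depends on all nodes agreeing on a maximum clock drift;
// the names and durations here are illustrative, not prescriptive.
type MasterLease struct {
	mu        sync.Mutex
	holder    string        // node id that currently holds the lease
	expiresAt time.Time     // local time after which the lease is invalid
	drift     time.Duration // assumed maximum clock drift between nodes
}

// Grant records that `node` holds the lease for `d` from now.
func (l *MasterLease) Grant(node string, d time.Duration) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.holder = node
	l.expiresAt = time.Now().Add(d)
}

// StillMaster reports whether `node` may keep acting as master.
// The drift margin makes the holder give up early, so it stops
// acting as master before any peer could consider the lease expired.
func (l *MasterLease) StillMaster(node string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.holder == node && time.Now().Before(l.expiresAt.Add(-l.drift))
}

// MayElectNewMaster reports whether a follower may start a new election.
// The drift margin makes followers wait past the worst-case expiry.
func (l *MasterLease) MayElectNewMaster() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	return time.Now().After(l.expiresAt.Add(l.drift))
}
```

The master would renew the lease, through the replication protocol itself, well before expiry so that mastership does not flap under normal operation.)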

Thank you for your answers.

Answer

Before I get to the actual question, I would suggest that for a Paxos-like system you don't think of it as a master-slave relationship, but rather as an equal-peer relationship. Basic Paxos doesn't even have a leader concept; Multi-Paxos tacks on a leader as a performance optimization, and electing that leader is part of the protocol.

Multi-Paxos boils down to Paxos underneath: there is a prepare phase and an accept phase. The insight of Multi-Paxos is that once a node wins an accept round, it has simultaneously won leader election; after that, the prepare phase isn't necessary for that leader until it detects that another node has taken over leadership.
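To make that concrete, here is a heavily simplified sketch (not the answer author's code) of how a Multi-Paxos proposer can skip the prepare phase while it believes it is the leader. The `Proposer` type and its RPC helpers are hypothetical stand-ins:

```go
package multipaxos

import "errors"

var (
	errPrepareRejected = errors.New("prepare rejected by a majority")
	errAcceptRejected  = errors.New("accept rejected by a majority")
	errLostLeadership  = errors.New("saw a higher ballot; leadership lost")
)

// Proposer holds the state a Multi-Paxos proposer needs to decide
// whether it can skip the prepare phase. The RPC helpers below are
// assumed to exist; their names are illustrative only.
type Proposer struct {
	isLeader bool
	ballot   int64

	nextBallot  func() int64
	sendPrepare func(instance, ballot int64) (promised bool, err error)
	sendAccept  func(instance, ballot int64, value []byte) (accepted, higherBallotSeen bool, err error)
}

// Propose drives one value through one Multi-Paxos instance.
// Winning an accept round doubles as winning leader election: while we
// stay leader, only the accept round trip is needed per instance.
func (p *Proposer) Propose(instance int64, value []byte) error {
	if !p.isLeader {
		// Phase 1 (prepare): only needed when we don't believe we lead.
		p.ballot = p.nextBallot()
		promised, err := p.sendPrepare(instance, p.ballot)
		if err != nil || !promised {
			return errPrepareRejected
		}
		p.isLeader = true
	}

	// Phase 2 (accept): the fast path while leadership is stable.
	accepted, higherBallotSeen, err := p.sendAccept(instance, p.ballot, value)
	if err != nil {
		return err
	}
	if higherBallotSeen {
		// Another node prepared a higher ballot: it has taken over.
		p.isLeader = false
		return errLostLeadership
	}
	if !accepted {
		return errAcceptRejected
	}
	return nil
}
```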

And now some practical advice. I have many years of experience working on several Paxos, Multi-Paxos, and other consensus systems.

I first suggest not implementing either Paxos or Multi-Paxos. Optimizing a Paxos system for performance while keeping it correct is very hard, especially if you are having these kinds of questions. I would instead look into implementing the Raft protocol.

Taking both protocols as written in their papers, the Raft protocol can have much better throughput than Multi-Paxos. The Raft authors (and others) suggest that Raft is easier to understand and to implement.

You may also look into using one of the open-source Raft systems. I don't have experience with any of them, so I can't tell you how easy they are to maintain. I have heard, though, of pain in maintaining Zookeeper instances. (I have also heard complaints about Zookeeper's correctness proof.)

Next, it has been proven that every consensus protocol can loop forever (this is the FLP impossibility result for asynchronous systems). Build a timeout mechanism into your system, and add randomized backoffs where appropriate. This is how practical engineers get around theoretical impossibilities.
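A hedged sketch of what this looks like in practice: wrap each election or proposal attempt in a deadline, and wait a randomized, growing interval before retrying so competing nodes stop colliding. The helper name and the durations here are made up for illustration:

```go
package retry

import (
	"context"
	"math/rand"
	"time"
)

// WithTimeoutAndBackoff runs attempt with a per-try deadline and retries
// with jittered exponential backoff. The randomization is what breaks the
// symmetric "dueling proposers" livelock in practice.
func WithTimeoutAndBackoff(ctx context.Context, attempt func(context.Context) error) error {
	backoff := 50 * time.Millisecond // illustrative starting point
	const maxBackoff = 2 * time.Second

	for {
		tryCtx, cancel := context.WithTimeout(ctx, 500*time.Millisecond)
		err := attempt(tryCtx)
		cancel()
		if err == nil {
			return nil
		}
		if ctx.Err() != nil {
			return ctx.Err() // caller gave up entirely
		}

		// Sleep a random fraction of the current backoff, then grow it.
		jitter := time.Duration(rand.Int63n(int64(backoff)))
		time.Sleep(jitter)
		if backoff < maxBackoff {
			backoff *= 2
		}
	}
}
```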

Finally, examine your throughput needs. If your throughput is high enough, you will need to figure out how to partition the keyspace across several consensus clusters. And that's a whole 'nother ball of wax.
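Partitioning is its own design problem, but the basic shape is simple: route each key to one of several independent consensus clusters, for example by hashing the key. A toy sketch, assuming a hypothetical `Cluster` type that runs its own Multi-Paxos or Raft group:

```go
package sharding

import "hash/fnv"

// Cluster is a stand-in for one independent consensus group
// (its own Paxos/Raft instance with its own leader).
type Cluster struct {
	Name string
	// ... proposers, acceptors, replicated log, etc.
}

// Router maps keys onto consensus clusters by hash. Operations that
// span clusters would need extra machinery (e.g. two-phase commit on
// top), which is the "ball of wax" mentioned above.
type Router struct {
	clusters []*Cluster
}

// ClusterFor picks the consensus group responsible for a key.
func (r *Router) ClusterFor(key string) *Cluster {
	h := fnv.New32a()
	h.Write([]byte(key))
	return r.clusters[h.Sum32()%uint32(len(r.clusters))]
}
```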
