What exactly does ZooKeeper fault tolerance mean: simultaneous or cumulative failures?


Question

As mentioned in the ZooKeeper Getting Started Guide, a minimum of three servers is required for a fault-tolerant clustered setup, and it is strongly recommended that you have an odd number of servers.

So if I have 5 servers, then as mentioned above the cluster can still survive when 2 of them fail. But does that mean 2 failures at the same time, or 2 failures in total?

So how about this sequence:
5 servers -> one fails -> 4 servers -> one fails -> 3 servers -> one fails -> 2 servers -> one fails -> dead

And what is the difference between a cluster of 3 servers (set up that way initially) and a cluster of 3 servers (degraded from 5 servers)?

Recommended answer

For a ZooKeeper cluster to work, it needs a quorum, and the quorum is a majority of the servers in the cluster.

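  • With a 3-node cluster, the majority is 2 nodes, so you can tolerate only 1 node not being in sync at the same time.
  • With a 5-node cluster, the majority is 3 nodes, so you can tolerate only 2 nodes not being in sync at the same time.
  • With a 7-node cluster, the majority is 4 nodes, so you can tolerate only 3 nodes not being in sync at the same time.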
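
The arithmetic behind these three cases can be sketched in a few lines of Python. This is only an illustration of the majority rule described above; the function names `quorum_size` and `tolerated_failures` are made up for this example and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the configured ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that may be out of sync at once while a majority remains."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 5, 7):
    print(f"{n} nodes: quorum {quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} out of sync")
```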

What does being in sync mean? A node is outside the quorum not only when it is not running, but also while it is still rejoining the cluster after a failure.

The nodes are hardcoded in the ZooKeeper configuration, so each node in the cluster knows that it should be part of a cluster with N nodes. A 7-node cluster with two nodes down therefore does not suddenly become a 5-node cluster that can lose another 2 nodes. It always behaves as a 7-node cluster, and only 3 nodes can be down at the same time, unless you change the configuration files.
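
To make the scenario from the question concrete, here is a small Python simulation of servers failing one at a time. It is a sketch under the assumption stated above, namely that the majority is always measured against the configured ensemble size and that the configuration is never changed. The names `has_quorum` and `survive_sequential_failures` are hypothetical helpers, not ZooKeeper functions.

```python
def has_quorum(configured_size: int, in_sync: int) -> bool:
    # The majority is measured against the configured ensemble size,
    # not against the number of servers that happen to be alive.
    return in_sync > configured_size // 2


def survive_sequential_failures(configured_size: int) -> list:
    """Fail one server at a time; record (servers in sync, quorum held?)."""
    history = []
    for in_sync in range(configured_size, 0, -1):
        history.append((in_sync, has_quorum(configured_size, in_sync)))
    return history


for in_sync, ok in survive_sequential_failures(5):
    status = "serving" if ok else "no quorum"
    print(f"{in_sync} of 5 in sync -> {status}")
```

Under this model, a 5-server ensemble that has degraded to 3 servers still holds a quorum but cannot lose another one, whereas an ensemble configured with 3 servers from the start can lose 1.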

The whole point about even and odd numbers of nodes is how many nodes can be down while the quorum is maintained. With a 4-node cluster, the majority is 3, so a 4-node cluster can still tolerate only 1 node being down. Hence it doesn't make much sense to use a 4-node cluster, which has the same fault tolerance as a 3-node cluster.
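
A quick way to see the odd-versus-even effect is to check what one extra server buys. The sketch below applies the same majority rule; `extra_tolerance_from_adding_one` is a name invented for this illustration.

```python
def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can be down while a strict majority remains."""
    return ensemble_size - (ensemble_size // 2 + 1)


def extra_tolerance_from_adding_one(ensemble_size: int) -> int:
    """Additional tolerated failures gained by adding one more server."""
    return tolerated_failures(ensemble_size + 1) - tolerated_failures(ensemble_size)


for n in (3, 4, 5, 6):
    gain = extra_tolerance_from_adding_one(n)
    print(f"{n} -> {n + 1} nodes: tolerates {gain} more failure(s)")
```

Going from an odd size to the next even size gains nothing, while going from an even size to the next odd size gains one tolerated failure.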
