What exactly does ZooKeeper fault tolerance mean: simultaneous or cumulative failures?

Question

As mentioned in the ZooKeeper Getting Started Guide, a minimum of three servers is required for a fault-tolerant clustered setup, and it is strongly recommended that you use an odd number of servers.

So if I have 5 servers, the cluster can still survive when 2 of them fail. But does that mean 2 failures at the same time, or 2 failures in total over time?

For example, how about this sequence:
5 servers -> one fails -> 4 servers -> one fails -> 3 servers -> one fails -> 2 servers -> one fails -> dead

And what is the difference between 3 servers (as initially configured) and 3 servers (degraded from a 5-server cluster)?

Recommended answer

For a ZooKeeper cluster to work, it needs a quorum. A quorum is a majority of the servers in the cluster.

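  • For a 3-node cluster, the majority is 2 nodes, so you can tolerate only 1 node being out of sync at a time.
  • For a 5-node cluster, the majority is 3 nodes, so you can tolerate only 2 nodes being out of sync at a time.
  • For a 7-node cluster, the majority is 4 nodes, so you can tolerate only 3 nodes being out of sync at a time.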
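
The arithmetic behind the list above can be written down as a minimal sketch. The function names `quorum_size` and `tolerated_failures` are made up for illustration and are not part of any ZooKeeper API; the only assumption is that a quorum is a strict majority of the configured servers.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the configured ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that may be out of sync while a quorum still exists."""
    return ensemble_size - quorum_size(ensemble_size)


# The three cluster sizes from the list above.
for n in (3, 5, 7):
    print(f"{n} servers: quorum {quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} out of sync")
```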

What does being in sync mean? A node is outside the quorum not only when it is not running, but also while it is still rejoining the cluster after a failure.

The nodes are hardcoded in the ZooKeeper configuration, so each node in the cluster knows that it should be part of a cluster with N nodes. A 7-node cluster with two nodes down therefore does not suddenly become a 5-node cluster that can lose another 2 nodes. It always behaves as a 7-node cluster, and at most 3 nodes can be down at any one time, unless you change the configuration files. In other words, the limit is on how many nodes are out of sync at the same moment, measured against the configured size: in the 5-server sequence from the question, the cluster stops serving when the third server fails, because 2 running servers are not a majority of 5.
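
To make the fixed-size behavior concrete, here is a small simulation of the one-by-one failure sequence from the question. It is only a sketch of the majority rule, not of ZooKeeper's actual leader election or sync protocol, and `has_quorum` and `simulate_failures` are hypothetical helper names.

```python
def has_quorum(configured: int, in_sync: int) -> bool:
    """A quorum is a strict majority of the *configured* ensemble size."""
    return in_sync > configured // 2


def simulate_failures(configured: int) -> list:
    """Fail one server at a time; record (servers in sync, quorum present?)."""
    return [(alive, has_quorum(configured, alive))
            for alive in range(configured, -1, -1)]


# Servers fail one after another and never come back.
for alive, ok in simulate_failures(5):
    state = "serving" if ok else "no quorum"
    print(f"{alive} of 5 configured servers in sync -> {state}")
```

Under the same rule, 3 servers left over from a 5-server configuration are exactly at quorum and cannot lose another server (`has_quorum(5, 2)` is false), while a cluster configured with 3 servers can still lose one (`has_quorum(3, 2)` is true).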

The point about even and odd numbers of nodes is basically about how many nodes can be down while the quorum is maintained. With a 4-node cluster, the majority is 3, so a 4-node cluster can still tolerate only 1 node being down. Hence it doesn't make much sense to use a 4-node cluster, which has the same fault tolerance as a 3-node cluster.
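
The odd-versus-even comparison can be checked with a few lines. This is an illustrative sketch only; `max_down` is a made-up helper that applies the strict-majority rule described in this answer.

```python
def max_down(ensemble_size: int) -> int:
    """Servers that can be down while a strict majority remains."""
    majority = ensemble_size // 2 + 1
    return ensemble_size - majority


# Compare each odd size with the next even size.
for odd in (3, 5, 7):
    even = odd + 1
    print(f"{odd} nodes tolerate {max_down(odd)} down, "
          f"{even} nodes tolerate {max_down(even)} down")
```

Each even size tolerates the same number of failures as the odd size just below it, so the extra server adds no fault tolerance.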
