Explaining Apache ZooKeeper

Question

I am trying to understand ZooKeeper: how it works and what it does. Is there any application that is comparable to ZooKeeper?

If you know it, how would you describe ZooKeeper to a layman?

I have tried the Apache wiki, the ZooKeeper SourceForge page... but I am still not able to relate to it.

I just read through http://zookeeper.sourceforge.net/index.sf.shtml. Aren't there more services like this? Is it as simple as just replicating a server service?

Recommended answer

In a nutshell, ZooKeeper helps you build distributed applications.

You may describe ZooKeeper as a replicated synchronization service with eventual consistency. It is robust because the persisted data is replicated across multiple nodes (this set of nodes is called an "ensemble"), and a client connects to any one of them (its specific "server"), migrating to another if that node fails. As long as a strict majority of the nodes is working, the ensemble is alive. In particular, a master node (the "leader" in ZooKeeper's own terminology) is dynamically chosen by consensus within the ensemble; if the master fails, the role of master migrates to another node.
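The "strict majority" rule above is plain arithmetic, and a small sketch may make it concrete. The function names `has_quorum` and `tolerated_failures` are made up for this illustration and are not part of any ZooKeeper API.

```python
def has_quorum(alive: int, ensemble_size: int) -> bool:
    """Return True when a strict majority of the ensemble is working."""
    if not 0 <= alive <= ensemble_size:
        raise ValueError("alive must be between 0 and ensemble_size")
    # Strict majority: more than half, so exactly half is not enough.
    return alive > ensemble_size // 2


def tolerated_failures(ensemble_size: int) -> int:
    """Largest number of failed nodes that still leaves a strict majority."""
    return (ensemble_size - 1) // 2


if __name__ == "__main__":
    for size in (3, 4, 5):
        print(size, "nodes tolerate", tolerated_failures(size), "failure(s)")
```

Under this rule a 4-node ensemble tolerates no more failures than a 3-node one, which is one reason odd ensemble sizes are the usual choice.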

The master is the authority for writes: this guarantees that writes are persisted in order, i.e., writes are linear (totally ordered). Each time a client writes to the ensemble, a majority of nodes persist the information; these nodes include the client's server and, obviously, the master. This means that each write brings the client's server up to date with the master. It also means, however, that you cannot have concurrent writes.

The guarantee of linear writes is the reason why ZooKeeper does not perform well for write-dominant workloads. In particular, it should not be used for the interchange of large data, such as media. As long as your communication involves shared data, ZooKeeper helps you. When data could be written concurrently, ZooKeeper actually gets in the way, because it imposes a strict ordering of operations even when that is not strictly necessary from the perspective of the writers. Its ideal use is coordination, where messages are exchanged between the clients.

Reads are where ZooKeeper excels: they are concurrent, since they are served by the specific server that the client connects to. However, this is also the reason for the eventual consistency: the "view" of a client may be outdated, since the master updates the corresponding server with a bounded but undefined delay. (The ZooKeeper documentation phrases this as a timeliness guarantee: a client's view is up to date within a certain time bound.)
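To see how ordered writes and possibly stale local reads fit together, here is a minimal in-memory sketch. It is a toy model, not ZooKeeper code: the class `ToyEnsemble`, its methods, and the explicit `catch_up` step (standing in for the master's delayed update of a server) are all assumptions made for illustration.

```python
class ToyEnsemble:
    """Toy model: one master orders writes, each server replays them later."""

    def __init__(self, servers):
        self.log = []                            # writes, in the order the master accepted them
        self.applied = {s: 0 for s in servers}   # number of log entries each server has applied
        self.state = {s: {} for s in servers}    # each server's local copy of the data

    def write(self, via, key, value):
        """Order the write through the master, then bring the client's server up to date."""
        self.log.append((key, value))
        self.catch_up(via)

    def catch_up(self, server):
        """Replay, in order, every write this server has not applied yet."""
        for key, value in self.log[self.applied[server]:]:
            self.state[server][key] = value
        self.applied[server] = len(self.log)

    def read(self, server, key):
        """Reads are local to one server and never contact the master."""
        return self.state[server].get(key)


if __name__ == "__main__":
    ensemble = ToyEnsemble(["s1", "s2"])
    ensemble.write("s1", "config", "v1")
    print(ensemble.read("s1", "config"))  # the writer's server has the value
    print(ensemble.read("s2", "config"))  # s2 has not caught up yet
    ensemble.catch_up("s2")
    print(ensemble.read("s2", "config"))  # same value once s2 replays the log
```

A client connected to `s2` sees an outdated view until `s2` replays the log, but it never sees writes in a different order than the master accepted them.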

The replicated database of ZooKeeper comprises a tree of znodes, which are entities roughly representing file system nodes (think of them as directories). Each znode may be enriched by a byte array, which stores data. Also, each znode may have other znodes under it, practically forming an internal directory system.
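As a data structure, the znode tree can be pictured roughly as follows. This is a hypothetical in-memory sketch: `ZNode`, `ZTree` and their methods are names invented here and only loosely mirror the kind of operations a real client offers.

```python
class ZNode:
    """Toy znode: holds a byte array and any number of named child znodes."""

    def __init__(self, data: bytes = b""):
        self.data = data
        self.children = {}


class ZTree:
    """Toy znode tree addressed by slash-separated absolute paths."""

    def __init__(self):
        self.root = ZNode()

    def _walk(self, path: str) -> ZNode:
        if not path.startswith("/"):
            raise ValueError("paths must be absolute")
        node = self.root
        for part in (p for p in path.split("/") if p):
            node = node.children[part]  # KeyError if the znode does not exist
        return node

    def create(self, path: str, data: bytes = b"") -> None:
        """Create a znode; its parent must already exist."""
        parent_path, _, name = path.rpartition("/")
        if not path.startswith("/") or not name:
            raise ValueError("invalid path: " + repr(path))
        parent = self._walk(parent_path or "/")
        if name in parent.children:
            raise KeyError("znode already exists: " + path)
        parent.children[name] = ZNode(data)

    def get(self, path: str) -> bytes:
        return self._walk(path).data

    def get_children(self, path: str) -> list:
        return sorted(self._walk(path).children)


if __name__ == "__main__":
    tree = ZTree()
    tree.create("/app", b"application root")
    tree.create("/app/config", b"timeout=30")
    print(tree.get("/app/config"), tree.get_children("/app"))
```

Unlike a plain file system, the same node can carry data and have children, which is why "directory with an attached byte array" is a fair mental picture.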

Interestingly, the name of a znode can be sequential, meaning that the name the client provides when creating the znode is only a prefix: the full name is completed by a sequential number chosen by the ensemble. This is useful, for example, for synchronization purposes: if multiple clients want to get a lock on a resource, they can each concurrently create a sequential znode at the same location, and whoever gets the lowest number is entitled to the lock.
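The lock idea can be sketched with a toy parent znode that hands out sequence numbers. Everything here (`SequentialParent`, `create_sequential`, `lock_holder`) is invented for illustration; the zero-padded 10-digit suffix imitates the way ZooKeeper formats sequence numbers, but the code does not talk to a real ensemble.

```python
class SequentialParent:
    """Toy stand-in for a znode whose children get sequential names."""

    SUFFIX_DIGITS = 10

    def __init__(self):
        self._counter = 0      # next sequence number, chosen by the "ensemble"
        self.children = {}     # full child name -> owner of that child

    def create_sequential(self, prefix: str, owner: str) -> str:
        """Create a child named prefix + sequence number and return the full name."""
        name = f"{prefix}{self._counter:0{self.SUFFIX_DIGITS}d}"
        self._counter += 1
        self.children[name] = owner
        return name

    def delete(self, name: str) -> None:
        del self.children[name]

    def lock_holder(self):
        """The owner of the child with the lowest sequence number holds the lock."""
        if not self.children:
            return None
        lowest = min(self.children, key=lambda n: int(n[-self.SUFFIX_DIGITS:]))
        return self.children[lowest]


if __name__ == "__main__":
    lock = SequentialParent()
    first = lock.create_sequential("lock-", "alice")
    lock.create_sequential("lock-", "bob")
    print(first, lock.lock_holder())
    lock.delete(first)            # alice releases the lock
    print(lock.lock_holder())     # bob is next in line
```

Because the numbers only ever grow, clients are served in the order in which their znodes were created, which gives a fair queue for the lock.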

Also, a znode may be ephemeral: this means that it is destroyed as soon as the client that created it disconnects (more precisely, when that client's session ends). This is mainly useful in order to know when a client fails, which may be relevant when the client itself has responsibilities that should be taken over by a new client. Taking the example of the lock, as soon as the client holding the lock disconnects, the other clients can check whether they are entitled to the lock.
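A minimal sketch of the ephemeral behavior, assuming a toy server that records which session owns each znode. `ToyServer` and its methods are hypothetical names for this illustration, and session handling in a real ensemble (timeouts, reconnects) is considerably more involved.

```python
class ToyServer:
    """Toy server tracking which session, if any, owns each znode."""

    def __init__(self):
        self.znodes = {}  # path -> owning session id, or None for a persistent znode

    def create(self, path: str, session=None) -> None:
        """Create a persistent znode (session=None) or an ephemeral one tied to a session."""
        self.znodes[path] = session

    def exists(self, path: str) -> bool:
        return path in self.znodes

    def close_session(self, session) -> None:
        """Remove every ephemeral znode owned by the session that just ended."""
        if session is None:
            raise ValueError("a session id is required")
        for path in [p for p, owner in self.znodes.items() if owner == session]:
            del self.znodes[path]


if __name__ == "__main__":
    server = ToyServer()
    server.create("/config")                      # persistent
    server.create("/locks/holder", session=42)    # ephemeral, owned by session 42
    server.close_session(42)                      # the client goes away
    print(server.exists("/locks/holder"), server.exists("/config"))
```

The point of the design is that cleanup does not depend on the failing client: the znode disappears even if the client crashes without releasing anything.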

The example related to client disconnection would be problematic if we needed to periodically poll the state of znodes. Fortunately, ZooKeeper offers an event system where a watch can be set on a znode. These watches may be set to trigger an event if the znode is changed or removed, or if new children are created under it. This is clearly useful in combination with the sequential and ephemeral options for znodes.
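The contrast with polling can be sketched with a toy store that calls back registered watchers. `WatchedStore` is a made-up class; the one-shot behavior (a watch fires once and must be set again) is modeled on how standard ZooKeeper watches behave, but treat the details as assumptions of this sketch.

```python
class WatchedStore:
    """Toy key-value store with one-shot watches on individual paths."""

    def __init__(self):
        self.data = {}
        self._watches = {}  # path -> list of callbacks waiting for the next event

    def watch(self, path, callback) -> None:
        """Register a callback for the next event on this path."""
        self._watches.setdefault(path, []).append(callback)

    def _fire(self, path, event) -> None:
        # Watches are removed before they run, so each one fires at most once.
        for callback in self._watches.pop(path, []):
            callback(event, path)

    def set(self, path, value) -> None:
        event = "changed" if path in self.data else "created"
        self.data[path] = value
        self._fire(path, event)

    def delete(self, path) -> None:
        del self.data[path]
        self._fire(path, "deleted")


if __name__ == "__main__":
    store = WatchedStore()
    store.set("/locks/holder", b"alice")
    store.watch("/locks/holder", lambda event, path: print(event, path))
    store.delete("/locks/holder")   # the waiting client is notified instead of polling
```

A client waiting for the lock would set a watch on the current holder's znode and re-check its own position only when the callback fires.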

A canonical example of ZooKeeper usage is distributed-memory computation, where some data is shared between client nodes and must be accessed and updated very carefully to account for synchronization.

ZooKeeper offers the library to construct your own synchronization primitives, while the ability to run a distributed server avoids the single-point-of-failure issue you have when using a centralized (broker-like) message repository.

ZooKeeper is feature-light, meaning that mechanisms such as leader election, locks, barriers, etc. are not already present, but can be written on top of the ZooKeeper primitives. If the C/Java API is too unwieldy for your purposes, you should rely on libraries built on ZooKeeper, such as cages and especially curator.

Apart from the official documentation, which is pretty good, I suggest reading Chapter 14 of Hadoop: The Definitive Guide, which has ~35 pages explaining essentially what ZooKeeper does, followed by an example of a configuration service.
