Zookeeper vs In-memory-data-grid vs Redis


Question

I've found different ZooKeeper definitions across multiple resources. Some of them may be taken out of context, but please take a look:

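  • A canonical example of Zookeeper usage is distributed-memory computation...

  • ZooKeeper is an open source Apache™ project that provides a centralized infrastructure and services that enable synchronization across a cluster.

  • Apache ZooKeeper is an open source file application program interface (API) that allows distributed processes in large systems to synchronize with each other so that all clients making requests receive consistent data.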

I've worked with Redis and Hazelcast, so it would be easier for me to understand ZooKeeper by comparing it with them.

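Could you please compare ZooKeeper with in-memory data grids (IMDGs) and Redis?

1. If it is about distributed-memory computation, how does ZooKeeper differ from in-memory data grids?
2. If it is about synchronization across a cluster, how does it differ from all the other in-memory stores? In-memory data grids also provide cluster-wide locks, and Redis also has some kind of transactions.
3. If it is only about consistent in-memory data, there are other alternatives. An IMDG lets you achieve the same thing, doesn't it?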

Recommended answer

https://zookeeper.apache.org/doc/current/zookeeperOver.html

By default, ZooKeeper replicates all of your data to every node and lets clients watch the data for changes. Changes are sent to clients very quickly (within a bounded amount of time). You can also create "ephemeral nodes", which are deleted within a specified time if the client that created them disconnects. ZooKeeper is highly optimized for reads, while writes are comparatively slow (every write has to be agreed on and replicated across the servers in the ensemble). Finally, the maximum size of a "file" (znode) in ZooKeeper is 1 MB, but typically a znode holds a single short string.
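
To make the watch and ephemeral-node behaviour concrete, here is a minimal in-memory sketch. It is a toy model, not the ZooKeeper client API: `ToyEnsemble`, `write`, `watch`, and `expire_session` are names made up for this illustration, and no networking or replication is involved. It assumes one-shot watches (a watch fires once and must be set again) and the 1 MB limit mentioned above.

```python
from collections import defaultdict

MAX_ZNODE_BYTES = 1024 * 1024  # the 1 MB per-znode limit described above


class ToyEnsemble:
    """In-memory stand-in for a ZooKeeper ensemble. Illustration only."""

    def __init__(self):
        self.data = {}                    # path -> bytes payload
        self.owner = {}                   # path -> session id, ephemeral znodes only
        self.watches = defaultdict(list)  # path -> pending one-shot callbacks

    def watch(self, path, callback):
        """Register a callback that fires once, on the next change to path."""
        self.watches[path].append(callback)

    def write(self, path, value, session=None):
        """Create or update a znode; passing a session makes it ephemeral."""
        if len(value) > MAX_ZNODE_BYTES:
            raise ValueError("payload exceeds the znode size limit")
        event = "changed" if path in self.data else "created"
        self.data[path] = value
        if session is not None:
            self.owner[path] = session
        self._notify(path, event)

    def expire_session(self, session):
        """Drop every ephemeral znode owned by a session that timed out."""
        for path in [p for p, s in self.owner.items() if s == session]:
            del self.data[path]
            del self.owner[path]
            self._notify(path, "deleted")

    def _notify(self, path, event):
        # Push model: the ensemble calls the clients; they do not poll.
        for callback in self.watches.pop(path, []):
            callback(path, event)


zk = ToyEnsemble()
seen = []
zk.watch("/servers/web-1", lambda path, event: seen.append((path, event)))
zk.write("/servers/web-1", b"alive", session="s1")  # web-1 registers itself
zk.watch("/servers/web-1", lambda path, event: seen.append((path, event)))
zk.expire_session("s1")                             # web-1 crashes
print(seen)
```

Running this prints `[('/servers/web-1', 'created'), ('/servers/web-1', 'deleted')]`: the watcher learns that the server came online and later went away without ever polling.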

Taken together, this means that ZooKeeper is not meant to store much data, and it is definitely not a cache. Instead, it is for managing heartbeats (knowing which servers are online), storing and updating configuration, and possibly message passing (though if you have a large number of messages or high throughput demands, something like RabbitMQ will be much better suited to the task).

Basically, ZooKeeper (and Curator, which is built on it) helps in handling the mechanics of clustering: heartbeats, distributing updates and configuration, distributed locks, etc.

It's not really comparable to Redis, but for the specific questions...

1. It doesn't support any computation, and for most data sets it won't be able to store the data with acceptable performance.

2. The data is replicated to all nodes in the cluster (there's nothing like Redis clustering, where the data can be distributed). All messages are processed atomically, in full, and are sequenced, so there are no real transactions. It can be used to implement cluster-wide locks for your services (it's very good at that, in fact), and there are a lot of locking primitives on the znodes themselves to control which nodes access them.

3. Sure, but ZooKeeper fills a niche. It's a tool for making distributed applications play nicely across multiple instances, not for storing or sharing large amounts of data. Compared to using an IMDG for this purpose, ZooKeeper will be faster, manages heartbeats and synchronization in a predictable way (with a lot of APIs that make this part easy), and uses a "push" paradigm instead of "pull", so nodes are notified of changes very quickly.
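
The cluster-wide locks mentioned above are commonly built from ephemeral sequential znodes: each contender creates a numbered child under a lock znode, and whoever holds the lowest number owns the lock. The sketch below models only that ordering rule in memory. `ToyLockQueue` and its methods are hypothetical names for this illustration, not part of ZooKeeper or Curator; in practice a library recipe such as Curator's `InterProcessMutex` would handle this for you.

```python
import itertools


class ToyLockQueue:
    """Models the children of one lock znode. Illustration only."""

    def __init__(self):
        self._next_seq = itertools.count()
        self.waiters = {}  # sequence number -> client name

    def enqueue(self, client):
        """Stands in for creating an ephemeral sequential child znode."""
        seq = next(self._next_seq)
        self.waiters[seq] = client
        return seq

    def holder(self):
        """The client whose child has the lowest sequence number holds the lock."""
        return self.waiters[min(self.waiters)] if self.waiters else None

    def remove(self, seq):
        """Explicit unlock, or session expiry deleting the ephemeral child."""
        self.waiters.pop(seq, None)


lock = ToyLockQueue()
a = lock.enqueue("service-a")
b = lock.enqueue("service-b")
first = lock.holder()
lock.remove(a)              # service-a unlocks, or crashes and its session expires
second = lock.holder()
print(first, second)
```

This prints `service-a service-b`. Because the child znodes are ephemeral, a crashed lock holder cannot block the others forever: its entry disappears when its session times out.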

The quotation from the linked question...

"A canonical example of Zookeeper usage is distributed-memory computation"

... is, IMO, a bit misleading. You would use it to orchestrate the computation, not to provide the data. For example, let's say you had to process rows 1-100 of a table. You might create 10 znodes with names like "1-10", "11-20", "21-30", etc. Client applications would be notified of this change automatically by ZooKeeper, and the first one would grab "1-10" and set an ephemeral node clients/192.168.77.66/processing/rows_1_10

The next application would see this and go for the next group to process. The actual data to compute on would be stored elsewhere (e.g. Redis, a SQL database, etc.). If the node failed partway through the computation, another node could see this (after 30-60 seconds) and pick up the job again.
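
The row-range example can be sketched as a small in-memory model. `ToyJobBoard`, `make_chunks`, and the worker addresses other than 192.168.77.66 are made up for this illustration; a real implementation would create and watch znodes through a ZooKeeper client, and the claim would be an ephemeral znode rather than a dictionary entry.

```python
def make_chunks(first, last, size):
    """Names for the work-item znodes, e.g. "1-10", "11-20", ..."""
    return [f"{lo}-{min(lo + size - 1, last)}" for lo in range(first, last + 1, size)]


class ToyJobBoard:
    """Tracks which worker has claimed which chunk. Illustration only."""

    def __init__(self, chunks):
        self.chunks = list(chunks)
        self.claims = {}   # chunk -> worker; stands in for ephemeral claim znodes
        self.done = set()

    def claim_next(self, worker):
        """Claim the first chunk that is neither finished nor already claimed."""
        for chunk in self.chunks:
            if chunk not in self.claims and chunk not in self.done:
                self.claims[chunk] = worker
                return chunk
        return None

    def finish(self, chunk):
        """Mark a chunk as processed and drop its claim."""
        self.claims.pop(chunk, None)
        self.done.add(chunk)

    def expire(self, worker):
        """Session timeout: every ephemeral claim held by the worker vanishes."""
        for chunk in [c for c, w in self.claims.items() if w == worker]:
            del self.claims[chunk]


board = ToyJobBoard(make_chunks(1, 100, 10))
first = board.claim_next("192.168.77.66")
second = board.claim_next("192.168.77.67")   # hypothetical second worker
board.expire("192.168.77.66")                # first worker dies mid-computation
retry = board.claim_next("192.168.77.68")    # hypothetical third worker
print(first, second, retry)
```

This prints `1-10 11-20 1-10`: the abandoned range becomes claimable again once the failed worker's claim disappears, while the rows themselves never pass through the coordinator.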

I'd say the canonical example of ZooKeeper is leader election, though. Let's say you have 3 nodes -- one is the master and the other 2 are slaves. If the master goes down, a slave node must become the new leader. This type of thing is perfect for ZooKeeper.
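
One common way to run such an election, sketched here under the assumption that each node registers an ephemeral sequential znode, is to treat the node with the lowest sequence number as leader and have every other node watch only its immediate predecessor. The functions and node names below are hypothetical and model just the selection rule, not a real ZooKeeper client.

```python
def leader(candidates):
    """candidates maps node name -> sequence number of its ephemeral znode."""
    return min(candidates, key=candidates.get) if candidates else None


def predecessor(candidates, name):
    """The node that `name` should watch: the one with the next-lower number."""
    lower = [n for n, seq in candidates.items() if seq < candidates[name]]
    return max(lower, key=candidates.get) if lower else None


candidates = {"node-a": 0, "node-b": 1, "node-c": 2}
before = leader(candidates)
watched_by_c = predecessor(candidates, "node-c")
del candidates["node-a"]           # the master's session expires
after = leader(candidates)
print(before, watched_by_c, after)
```

This prints `node-a node-b node-b`. Watching only the predecessor means a leader failure wakes up one node instead of the whole cluster.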
