How to implement real-time replication of MongoDB (or CouchDB) to many remote clients


Problem description

I'm considering how to design a mechanism for replicating a (potentially large) MongoDB or other NoSQL (CouchDB, etc) database to dozens of clients at once. The clients would function like a replica set, but the replication would be one-way and the remote clients would belong to other parties. Specifically, I am looking for the following features:

  • real-time: changes to the master database should be pushed out to the clients as quickly as possible
  • replication to new clients: a new client must be able to connect, automatically sync the majority of existing data, then receive real-time updates.
  • efficient: both the initial synchronization/transfer of data and tracking of real-time updates ("diffs", if you will) are computationally efficient, with multiple clients connected.
  • secure: the master database presents an interface to which remote clients (who do not belong to the same owner or system) can connect: i.e., we cannot just add all the clients to the master's replica set.
  • robust: a temporary connection failure between a client and the master database should be easily and efficiently recoverable.

In some sense, the server is publishing a collection of data and the clients are subscribing to it. I realize that this is a hard software engineering problem, and to my knowledge no piece of software has implemented this exactly yet. However, some approaches have come to mind as close, which I'll list below.

  • Meteor's DDP protocol: It is designed to do this with Mongo-like collections and implements exactly the model of publishing and subscribing to a set of data (rather than a stream of messages). It manages the initial sync and sends along live changes. However, it is still in development and far from being an industrial-strength solution: the server keeps a copy of every client's state in a possibly inefficient way, and it has only been tested on collections that fit in the memory of a web app. Also, it appears that DDP cannot efficiently sync an out-of-date database without fetching everything from scratch. If anyone can point to examples of how large a collection can be synced over DDP, that would be great. (See also: Documentation or code details on Meteor's DDP pub/sub protocol?)

  • Broadcasting the Mongo oplog: Using a high-throughput message bus like Apache Kafka, one may be able to efficiently send the oplog to many clients at once. This tackles some of the system implementation challenges. However, this requires that the clients start with an initial sync that gets them close enough to the current master state somehow and then start replaying the oplog from the appropriate point.

  • Continuous replication a la CouchDB: I'm not sure how this is implemented and how robust it is, given the sparsity of the documentation. However, it does seem to work over remote database connections. How efficient is this, though, when multiple clients are trying to replicate at the same time? (A similar hack to this would be to make the clients MongoDB Priority 0 replica set members; however, that seems to be far from its intended use. See also: http://guide.couchdb.org/draft/replication.html)
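To make the common pattern behind the oplog and continuous-replication ideas concrete, here is a minimal in-memory sketch of "snapshot, then replay a sequenced change log from a checkpoint". It is not MongoDB's or CouchDB's API: `Publisher`, `Subscriber`, `snapshot`, `changes_since`, and `catch_up` are hypothetical names chosen for illustration, and the sketch assumes the log is never truncated. A real oplog is capped, so a client that falls too far behind would need a fresh initial sync.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple

Doc = Dict[str, Any]


@dataclass
class Change:
    seq: int            # 1-based position in the publisher's log
    doc_id: str
    doc: Optional[Doc]  # None means the document was deleted


@dataclass
class Publisher:
    """Hypothetical master side: current documents plus an append-only log."""
    docs: Dict[str, Doc] = field(default_factory=dict)
    log: List[Change] = field(default_factory=list)

    def write(self, doc_id: str, doc: Optional[Doc]) -> None:
        """Upsert a document (or delete it when doc is None) and log the change."""
        if doc is None:
            self.docs.pop(doc_id, None)
        else:
            self.docs[doc_id] = dict(doc)
        logged = None if doc is None else dict(doc)
        self.log.append(Change(len(self.log) + 1, doc_id, logged))

    def snapshot(self) -> Tuple[Dict[str, Doc], int]:
        """Initial sync: a copy of all documents and the seq it corresponds to."""
        return {k: dict(v) for k, v in self.docs.items()}, len(self.log)

    def changes_since(self, seq: int) -> List[Change]:
        """Log entries with a sequence number greater than seq, in order."""
        return self.log[seq:]


@dataclass
class Subscriber:
    """Hypothetical remote client: a read-only copy plus a checkpoint."""
    docs: Dict[str, Doc] = field(default_factory=dict)
    last_seq: int = 0

    def initial_sync(self, pub: Publisher) -> None:
        self.docs, self.last_seq = pub.snapshot()

    def catch_up(self, pub: Publisher) -> int:
        """Replay everything after the checkpoint; returns the number applied."""
        applied = 0
        for change in pub.changes_since(self.last_seq):
            if change.doc is None:
                self.docs.pop(change.doc_id, None)
            else:
                self.docs[change.doc_id] = dict(change.doc)
            self.last_seq = change.seq  # advance the checkpoint per entry
            applied += 1
        return applied


pub = Publisher()
pub.write("a", {"n": 1})

sub = Subscriber()
sub.initial_sync(pub)      # new client joins

pub.write("b", {"n": 2})   # these happen while the client is disconnected
pub.write("a", None)

sub.catch_up(pub)          # reconnect: only the missed entries are replayed
print(sub.docs == pub.docs, sub.last_seq)
```

Because the client stores only `last_seq`, the server does not need to track per-client state, and recovery after a dropped connection costs only the entries the client missed. Authentication, transport, and fan-out to many clients (for example through a message bus) are outside the scope of this sketch.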

Please give pointers to software or pieces of software that already implement parts of this, or suggestions on the algorithms/data structures needed to do this efficiently.

Solution

If you are looking specifically for real-time replication, I'd recommend you look into SaaS offerings specifically for this purpose, such as https://www.firebase.com/
