MongoDB load balancing in multiple AWS instances

Question

We're using Amazon Web Services for a business application that uses a node.js server with MongoDB as its database. The node.js server currently runs on an EC2 medium instance, and the MongoDB database is kept on a separate micro instance. We now want to deploy a replica set for the MongoDB database, so that if MongoDB gets locked or becomes unavailable, we can still run the database and read data from it.

So we're trying to keep each member of the replica set on a separate instance, so that we can still get data from the database even if the instance hosting the primary member shuts down.

Now I want to add a load balancer for the database, so that it keeps working even under a large burst of traffic. I can balance reads by adding the slaveOK setting to the replica set, but that will not balance the load if there is heavy write traffic.
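
As a rough illustration of the read-balancing idea, a replica set connection string can ask for secondary reads through the `readPreference` URI option, which is how current drivers express what slaveOK used to do. The hostnames and the set name `rs0` below are hypothetical, and this is only a sketch of how such a string could be assembled, not the setup used here.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, read_preference="secondaryPreferred"):
    """Build a MongoDB connection string that lists every replica set member."""
    # Options travel as a query string. The driver finds the current primary
    # on its own and sends writes there, whatever the read preference is.
    options = urlencode({"replicaSet": replica_set, "readPreference": read_preference})
    return "mongodb://" + ",".join(hosts) + "/?" + options


# Hypothetical hostnames, one per EC2 instance.
hosts = [
    "db1.example.internal:27017",
    "db2.example.internal:27017",
    "db3.example.internal:27017",
]
uri = replica_set_uri(hosts, "rs0")
print(uri)
```

Only reads are affected by this option; every write still goes to the single primary.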

So far I have found two options for solving this problem.

Option 1: Shard the database and keep each shard on a separate instance, with each shard's replica set on that same instance. The problem is that sharding splits the database into parts, so each shard holds different data. If one instance shuts down, we can no longer access the data in the shard on that instance.

To get around this, I'm trying to split the database into shards where each shard has a replica set spread over separate instances, so that losing one instance causes no problem. But with 2 shards and 3 members per replica set, I need 6 AWS instances, so I don't think this is the optimal solution.
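
The instance count above is simple arithmetic, sketched here so other layouts can be compared. The function name is made up for illustration, and the count covers only data-bearing replica set members; the config servers and `mongos` routers that a sharded cluster also needs are deliberately left out.

```python
def data_instances(shards, members_per_shard):
    """Instances needed when every replica set member gets its own instance."""
    return shards * members_per_shard


# The layout from the question: 2 shards, 3 members in each replica set.
print(data_instances(2, 3))  # 6
# A single unsharded replica set with 3 members, for comparison.
print(data_instances(1, 3))  # 3
```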

Option 2: Create a master-master configuration in MongoDB, meaning every database is a primary with read/write access. I would also like them to auto-sync with each other every so often, so that they all end up as clones of each other. Each of these primaries would be on a separate instance. But I don't know whether MongoDB supports this structure.

I haven't found any MongoDB documentation or blog post covering this situation. Please suggest what the best solution for this problem would be.

Recommended answer

This won't be a complete answer by far. There are too many details, and I (like many others) could write an entire essay about this question. Since I don't have that kind of time to spare, I will add some commentary on what I see.

"Now, I want to add load balancer in the database, so that the database works fine even in huge traffic load at a time."

Replica sets are not designed to work like that. If you want load balancing, you are probably looking for sharding, which does allow it.
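
To see why sharding spreads load, here is a minimal sketch that assigns documents to shards by hashing a shard key. It is a simplification under stated assumptions: the `shard_for` helper, the `user-N` keys, and the modulo mapping are all hypothetical, and real MongoDB routes through `mongos` using chunk ranges over the shard key rather than this function.

```python
import hashlib
from collections import Counter


def shard_for(key, shard_count):
    """Map a shard-key value to a shard index using a stable hash."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    # Take the first 8 bytes as an integer, then fold into the shard range.
    return int.from_bytes(digest[:8], "big") % shard_count


# Simulate 1000 writes with made-up keys against 2 shards.
writes = Counter(shard_for(f"user-{i}", 2) for i in range(1000))
print(dict(writes))
```

Each shard receives only part of the writes, which is the load-spreading effect a replica set alone does not give you.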

Replication is for automatic failover.
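
Failover depends on a majority of voting members still being reachable, which is why three members on three instances is the usual minimum. A minimal sketch of that rule follows; the function name is hypothetical and the model ignores arbiters, priorities, and non-voting members.

```python
def can_elect_primary(members_up, total_members):
    """A replica set needs a strict majority of voting members to elect a primary."""
    return members_up > total_members // 2


# Three-member set, one member per instance.
total = 3
for up in range(total, -1, -1):
    if can_elect_primary(up, total):
        state = "primary available, writes possible"
    else:
        state = "no primary, at best reads from surviving secondaries"
    print(f"{up}/{total} members up: {state}")
```

With three members the set survives the loss of any one instance; with only two members, losing either one leaves no majority.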

"In that case I can read balance the database by adding slaveOK config in the replicaSet."

Since your secondaries have to apply just as many write ops as the primary in order to stay up to date, this might not help much.

In reality, instead of one server with many connections queued, you have many connections on many servers queueing for stale data, since consistency between members is eventual rather than immediate, unlike ACID technologies. That said, secondaries are typically only 32-odd ms behind, which means they are not lagging enough to give decent throughput if the primary is loaded.

Since reads ARE concurrent, you will get the same speed whether you read from the primary or a secondary. I suppose you could delay a slave to create a pause in ops, but that would return massively stale data.

Not to mention that MongoDB is not multi-master, so you can only write to one node at a time. That makes slaveOK far from the most useful setting in the world, and I have seen numerous cases where 10gen themselves recommend sharding over this setting.
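
The argument above can be put into a toy model. It assumes reads spread evenly over all members of a shard and that every member has to apply every write sent to its shard; the workload numbers are made up, and real throughput depends on much more than this.

```python
def per_node_ops(reads, writes, members, shards=1):
    """Toy estimate of the ops/sec each node has to handle."""
    # Reads are divided among every member of every shard.
    reads_per_node = reads / (shards * members)
    # Writes are divided among shards only: each member of a shard
    # replicates all of that shard's writes.
    writes_per_node = writes / shards
    return reads_per_node + writes_per_node


# Write-heavy workload: 3,000 reads/s and 9,000 writes/s (made-up numbers).
print(per_node_ops(3000, 9000, members=1))            # 12000.0, single node
print(per_node_ops(3000, 9000, members=3))            # 10000.0, secondary reads
print(per_node_ops(3000, 9000, members=3, shards=2))  # 5000.0, two shards
```

In this model, secondary reads barely reduce per-node load for a write-heavy workload, while adding a shard halves it.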

"Option 2: We can create a master-master configuration in the mongodb,"

This would require your own coding, at which point you may want to consider a database that actually supports multi-master replication: http://en.wikipedia.org/wiki/Multi-master_replication

That is because the speed you are looking for is most likely in writes, not reads, as discussed above.

"Option 1: I've to shard the database and keep each shard in separate instance."

This is the recommended way, but you have found the caveat with it. Unfortunately this remains an unsolved problem, one that multi-master replication is supposed to solve. However, multi-master replication brings its own ship of plague rats to Europe, and I would strongly recommend doing some serious research before concluding that MongoDB cannot currently service your needs.

You might really be worrying about nothing: the fsync queue is designed to deal with the IO bottleneck that would otherwise slow down your writes, as it would in SQL, and reads are concurrent. If you plan your schema and working set right, you should be able to get a massive number of ops.

There is in fact a linked answer from a 10gen employee that is well worth reading: https://stackoverflow.com/a/17459488/383478. It shows just how much throughput MongoDB can achieve under load.

That throughput should grow soon with the new document-level locking that is already in the dev branch.
