Moving a shard from one BigCouch server to another (for balancing)


Question

I'm currently testing BigCouch with large amounts of data (15 million records daily).

When I need to generate views of the data, I run into balancing problems because one of my two machines is much weaker than the other (single-core vs. dual-core). As a result, the better machine finishes and sits idle while the weaker one still has a lot of work to do.

My idea is to move some shards from the weaker machine to the other one, so that both finish at about the same time.

So my question is: how can I move shards from the weaker BigCouch server to the better one?

Thank you for your help, and best regards!

Andy

Recommended answer

BigCouch shards are simply CouchDB databases, so the procedure for moving them is pretty simple. A future release of BigCouch will automate the process but, for now, I'll just describe it.

A little background will help ground the explanation. A BigCouch node listens on two ports, 5984 and 5986. The front port, 5984, looks like CouchDB (while being clustered and fault-tolerant). The back port, 5986, talks directly to the underlying CouchDB server on a particular node. You will notice that localhost:5986/_all_dbs shows two extra databases besides the shards of your database. One is called 'nodes', and you have already interacted with it when you set up your cluster. The other is called 'dbs' and contains a document for each clustered database, specifying where each copy of each shard of your database actually lives.
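As a rough illustration of the back port, the sketch below lists the databases on a node's port 5986 and separates the 'nodes' and 'dbs' databases from the shard databases. The helper names are hypothetical, and it assumes that shard databases appear in _all_dbs with names beginning with "shards/".

```python
import json
from urllib.request import urlopen

# The two bookkeeping databases mentioned above.
SYSTEM_DBS = {"nodes", "dbs"}


def backend_url(host, path, port=5986):
    """Build a URL for the back port (5986) of a single node."""
    return f"http://{host}:{port}/{path.lstrip('/')}"


def split_backend_dbs(names):
    """Separate the 'nodes'/'dbs' databases from shard databases.

    Assumption: shard databases are listed with a 'shards/' prefix.
    """
    system = [n for n in names if n in SYSTEM_DBS]
    shards = [n for n in names if n.startswith("shards/")]
    return system, shards


def list_backend_dbs(host):
    """Fetch _all_dbs from the back port of the given node."""
    with urlopen(backend_url(host, "_all_dbs")) as resp:
        return json.load(resp)
```

On a live node, `split_backend_dbs(list_backend_dbs("localhost"))` would return the two groups; the pure helper can also be tried on a hand-written list of names.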

So, to move a shard, you need to do a few things:

  1. Identify the shard file.
  2. Copy the shard file to the new server.
  3. Tell BigCouch about its new location.
  4. Finish with a replication, if needed.

Step 1

In the data directory of your BigCouch node, you will find files like this:

shards/a0000000-bfffffff/foo.1312544893.couch


All shards are organized under the shards/ directory, then by range, and finally by the database name followed by a random number.
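The layout just described can be taken apart mechanically. The following is a minimal sketch with hypothetical names (`ShardFile`, `parse_shard_path`); it assumes the database name itself contains no slashes.

```python
from typing import NamedTuple


class ShardFile(NamedTuple):
    hash_range: str  # e.g. "a0000000-bfffffff"
    db: str          # clustered database name, e.g. "foo"
    suffix: str      # the number after the name, e.g. "1312544893"


def parse_shard_path(path):
    """Split 'shards/<range>/<db>.<number>.couch' into its parts."""
    parts = path.split("/")
    if len(parts) < 3 or parts[-3] != "shards" or not parts[-1].endswith(".couch"):
        raise ValueError(f"not a shard file path: {path!r}")
    stem = parts[-1][: -len(".couch")]
    # The number is whatever follows the last dot of the file stem.
    db, _, suffix = stem.rpartition(".")
    if not db:
        raise ValueError(f"no database name in: {path!r}")
    return ShardFile(parts[-2], db, suffix)


info = parse_shard_path("shards/a0000000-bfffffff/foo.1312544893.couch")
print(info.hash_range, info.db, info.suffix)
```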

Select one of the files for your database and note its name.

Step 2

Use any method to copy this file to the same path on your target server. rsync and scp are fine choices, as is CouchDB replication (be sure to replicate from port 5986 to port 5986).

Step 3

The document in 'dbs' that governs the layout of your clustered database needs to be modified. It looks a bit like this:

{
  "_id": "baz",
  "_rev": "1-912fe2dd63e0a570a4ceb26fd742dffd",
  "shard_suffix": [46,49,51,49,50,53,52,53,50,49,55],
  "changelog": [
    ["add", "00000000-7fffffff", "dev1@127.0.0.1"],
    ["add", "80000000-ffffffff", "dev1@127.0.0.1"]
  ],
  "by_node": {
    "dev1@127.0.0.1": ["00000000-7fffffff", "80000000-ffffffff"]
  },
  "by_range": {
    "00000000-7fffffff": ["dev1@127.0.0.1"],
    "80000000-ffffffff": ["dev1@127.0.0.1"]
  }
}

Update both the by_node and by_range values so that the shard you have moved resolves to the new host.
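The edit to the 'dbs' document could be scripted along these lines. This is only a sketch: `move_shard` is a hypothetical helper, "dev2@127.0.0.1" is a made-up target node, and dropping a node from by_node once it holds no ranges is an assumption about what BigCouch expects. The changelog is left untouched, since the answer only asks for by_node and by_range to be updated.

```python
import copy


def move_shard(doc, shard_range, source, target):
    """Return a copy of a 'dbs' document with one range moved to target."""
    new = copy.deepcopy(doc)
    holders = new["by_range"].get(shard_range, [])
    if source not in holders:
        raise ValueError(f"{source} does not hold {shard_range}")

    # by_range: replace the source node with the target node.
    holders = [n for n in holders if n != source]
    if target not in holders:
        holders.append(target)
    new["by_range"][shard_range] = holders

    # by_node: take the range away from the source, give it to the target.
    new["by_node"][source].remove(shard_range)
    if not new["by_node"][source]:
        del new["by_node"][source]  # assumption: empty entries are dropped
    target_ranges = new["by_node"].setdefault(target, [])
    if shard_range not in target_ranges:
        target_ranges.append(shard_range)
    return new


doc = {
    "_id": "baz",
    "by_node": {"dev1@127.0.0.1": ["00000000-7fffffff", "80000000-ffffffff"]},
    "by_range": {
        "00000000-7fffffff": ["dev1@127.0.0.1"],
        "80000000-ffffffff": ["dev1@127.0.0.1"],
    },
}
updated = move_shard(doc, "80000000-ffffffff", "dev1@127.0.0.1", "dev2@127.0.0.1")
print(updated["by_node"])
print(updated["by_range"])
```

The returned document would then be saved back to the 'dbs' database on port 5986 as an ordinary CouchDB document update, keeping the current _rev.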

At this point you have moved the shard. However, if any updates arrived after you started copying the file but before you updated the 'dbs' document, those writes happened at the original node and are not visible, so you should proceed to step 4. If there have been no updates, you can delete the shard on the original server, though I recommend you check your database on port 5984 to be sure all your docs show up correctly.

Step 4

Perform a replication from the source shard to the target shard, again taking care to do this on port 5986 of each. This will ensure that all updates are available once again. You can now delete the copy of this shard on the original server.
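A shard-to-shard replication like the one described above might be prepared as follows. This is a hedged sketch: the host names old.example and new.example are placeholders, the helper names are hypothetical, and it assumes that shard_suffix in the 'dbs' document holds the character codes of the suffix in the shard file name. The request uses CouchDB's standard _replicate endpoint with "source" and "target" fields; slashes in the shard database name are percent-encoded so that it fits in a URL.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def shard_db_name(db, shard_range, shard_suffix):
    """Build the shard database name, e.g. 'shards/<range>/<db>.<number>'.

    Assumption: shard_suffix is a list of character codes.
    """
    suffix = "".join(chr(c) for c in shard_suffix)
    return f"shards/{shard_range}/{db}{suffix}"


def shard_url(host, name, port=5986):
    """URL of a shard database on the back port, with '/' encoded as %2F."""
    return f"http://{host}:{port}/{quote(name, safe='')}"


def replication_request(source_host, target_host, name):
    """Prepare (but do not send) a POST to _replicate on the target's back port."""
    body = {
        "source": shard_url(source_host, name),
        "target": shard_url(target_host, name),
    }
    return Request(
        f"http://{target_host}:5986/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


name = shard_db_name("baz", "80000000-ffffffff",
                     [46, 49, 51, 49, 50, 53, 52, 53, 50, 49, 55])
req = replication_request("old.example", "new.example", name)
print(req.full_url)
print(req.data.decode("utf-8"))
```

Passing the prepared request to `urllib.request.urlopen` would start the replication; it is left unsent here so the sketch can be inspected without a running cluster.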

HTH, Robert Newson - Cloudant.
