First time with MongoDB + Docker - Set up from docker compose


Problem description


I'd like to try a project I found on GitHub, so I installed MongoDB on macOS, and now I'm trying to understand how to set it up correctly through the Docker Compose file in the project directory. This is the compose file:

version: '3'
services:
# replica set 1
  mongors1n1:
    container_name: mongors1n1
    image: mongo
    command: mongod --shardsvr --replSet mongors1 --dbpath /data/db --port 27017
    ports:
      - 27017:27017
    expose:
      - "27017"
    volumes:
      - ~/mongo_cluster/data1:/data/db

  mongors1n2:
    container_name: mongors1n2
    image: mongo
    command: mongod --shardsvr --replSet mongors1 --dbpath /data/db --port 27017
    ports:
      - 27027:27017
    expose:
      - "27017"
    volumes:
      - ~/mongo_cluster/data2:/data/db

  mongors1n3:
    container_name: mongors1n3
    image: mongo
    command: mongod --shardsvr --replSet mongors1 --dbpath /data/db --port 27017
    ports:
      - 27037:27017
    expose:
      - "27017"

    volumes:
      - ~/mongo_cluster/data3:/data/db

# replica set 2
  mongors2n1:
    container_name: mongors2n1
    image: mongo
    command: mongod --shardsvr --replSet mongors2 --dbpath /data/db --port 27017
    ports:
      - 27047:27017
    expose:
      - "27017"
    volumes:
      - ~/mongo_cluster/data4:/data/db

  mongors2n2:
    container_name: mongors2n2
    image: mongo
    command: mongod --shardsvr --replSet mongors2 --dbpath /data/db --port 27017
    ports:
      - 27057:27017
    expose:
      - "27017"
    volumes:
      - ~/mongo_cluster/data5:/data/db

  mongors2n3:
    container_name: mongors2n3
    image: mongo
    command: mongod --shardsvr --replSet mongors2 --dbpath /data/db --port 27017
    ports:
      - 27067:27017
    expose:
      - "27017"

    volumes:
      - ~/mongo_cluster/data6:/data/db

  # mongo config server
  mongocfg1:
    container_name: mongocfg1
    image: mongo
    command: mongod --configsvr --replSet mongors1conf --dbpath /data/db --port 27017
    expose:
      - "27017"
    volumes:
      - ~/mongo_cluster/config1:/data/db

  mongocfg2:
    container_name: mongocfg2
    image: mongo
    command: mongod --configsvr --replSet mongors1conf --dbpath /data/db --port 27017
    expose:
      - "27017"
    volumes:
      - ~/mongo_cluster/config2:/data/db

  mongocfg3:
    container_name: mongocfg3
    image: mongo
    command: mongod --configsvr --replSet mongors1conf --dbpath /data/db --port 27017

    expose:
      - "27017"
    volumes:
      - ~/mongo_cluster/config3:/data/db

# mongos router
  mongos1:
    container_name: mongos1
    image: mongo
    depends_on:
      - mongocfg1
      - mongocfg2
    command: mongos --configdb mongors1conf/mongocfg1:27017,mongocfg2:27017,mongocfg3:27017 --port 27017
    ports:
      - 27019:27017
    expose:
      - "27017"

  mongos2:
    container_name: mongos2
    image: mongo
    depends_on:
      - mongocfg1
      - mongocfg2
    command: mongos --configdb mongors1conf/mongocfg1:27017,mongocfg2:27017,mongocfg3:27017 --port 27017
    ports:
      - 27020:27017
    expose:
      - "27017"


# TODO after running docker-compose
# conf = rs.config();
# conf.members[0].priority = 2;
# rs.reconfig(conf);

And this is the script to bring everything up and create the shards, etc.:

#!/bin/sh
docker-compose up
# configure our config server replica set
docker exec -it mongocfg1 bash -c "echo 'rs.initiate({_id: \"mongors1conf\", configsvr: true, members: [{ _id : 0, host : \"mongocfg1\" },{ _id : 1, host : \"mongocfg2\" }, { _id : 2, host : \"mongocfg3\" }]})' | mongo"

# build the replica shards
docker exec -it mongors1n1 bash -c "echo 'rs.initiate({_id : \"mongors1\", members: [{ _id : 0, host : \"mongors1n1\" },{ _id : 1, host : \"mongors1n2\" },{ _id : 2, host : \"mongors1n3\" }]})' | mongo"
docker exec -it mongors2n1 bash -c "echo 'rs.initiate({_id : \"mongors2\", members: [{ _id : 0, host : \"mongors2n1\" },{ _id : 1, host : \"mongors2n2\" },{ _id : 2, host : \"mongors2n3\" }]})' | mongo"

# add the shards to the routers
docker exec -it mongos1 bash -c "echo 'sh.addShard(\"mongors1/mongors1n1\")' | mongo"
docker exec -it mongos1 bash -c "echo 'sh.addShard(\"mongors2/mongors2n1\")' | mongo"

If I try to run the script directly, I get errors like these:

mongos1 | {"t":{"$date":"2021-07-25T09:03:56.101+00:00"},"s":"I", "c":"-", "id":4333222, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"RSM received error response","attr":{"host":"mongocfg3:27017","error":"HostUnreachable: Error connecting to mongocfg3:27017 (172.18.0.2:27017) :: caused by :: Connection refused","replicaSet":"mongors1conf","response":"{}"}}

mongos1 | {"t":{"$date":"2021-07-25T09:03:56.101+00:00"},"s":"I", "c":"NETWORK", "id":4712102, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Host failed in replica set","attr":{"replicaSet":"mongors1conf","host":"mongocfg3:27017","error":{"code":6,"codeName":"HostUnreachable","errmsg":"Error connecting to mongocfg3:27017 (172.18.0.2:27017) :: caused by :: Connection refused"},"action":{"dropConnections":true,"requestImmediateCheck":false,"outcome":{"host":"mongocfg3:27017","success":false,"errorMessage":"HostUnreachable: Error connecting to mongocfg3:27017 (172.18.0.2:27017) :: caused by :: Connection refused"}}}}

And other errors like:

mongos1 | {"t":{"$date":"2021-07-25T09:05:39.743+00:00"},"s":"I", "c":"-", "id":4939300, "ctx":"monitoring-keys-for-HMAC","msg":"Failed to refresh key cache","attr":{"error":"FailedToSatisfyReadPreference: Could not find host matching read preference { mode: "nearest" } for set mongors1conf","nextWakeupMillis":1800}}

Shouldn't Docker configure all of this without the user having to? Or do I need to create something manually, like the database?

EDIT: Here are the first errors that show up when I run the script: log

Solution

So here is an attempt at helping. For the most part, the docker compose YAML file is pretty close, with the exception of some minor port and binding parameters. The expectation is that initialization happens through additional commands. Example:

  1. docker-compose up the environment
  2. run some scripts to init the environment

... but this was already part of the original post.

So here is a docker compose file:

docker-compose.yml

version: '3'
services:
 # mongo config server
  mongocfg1:
    container_name: mongocfg1
    hostname: mongocfg1
    image: mongo
    command: mongod --configsvr --replSet mongors1conf --dbpath /data/db --port 27019 --bind_ip_all
    volumes:
      - ~/mongo_cluster/config1:/data/db

  mongocfg2:
    container_name: mongocfg2
    hostname: mongocfg2
    image: mongo
    command: mongod --configsvr --replSet mongors1conf --dbpath /data/db --port 27019 --bind_ip_all
    volumes:
      - ~/mongo_cluster/config2:/data/db

  mongocfg3:
    container_name: mongocfg3
    hostname: mongocfg3
    image: mongo
    command: mongod --configsvr --replSet mongors1conf --dbpath /data/db --port 27019 --bind_ip_all
    volumes:
      - ~/mongo_cluster/config3:/data/db

# replica set 1
  mongors1n1:
    container_name: mongors1n1
    hostname: mongors1n1
    image: mongo
    command: mongod --shardsvr --replSet mongors1 --dbpath /data/db --port 27018 --bind_ip_all
    volumes:
      - ~/mongo_cluster/data1:/data/db

  mongors1n2:
    container_name: mongors1n2
    hostname: mongors1n2
    image: mongo
    command: mongod --shardsvr --replSet mongors1 --dbpath /data/db --port 27018 --bind_ip_all
    volumes:
      - ~/mongo_cluster/data2:/data/db

  mongors1n3:
    container_name: mongors1n3
    hostname: mongors1n3
    image: mongo
    command: mongod --shardsvr --replSet mongors1 --dbpath /data/db --port 27018 --bind_ip_all
    volumes:
      - ~/mongo_cluster/data3:/data/db

# replica set 2
  mongors2n1:
    container_name: mongors2n1
    hostname: mongors2n1
    image: mongo
    command: mongod --shardsvr --replSet mongors2 --dbpath /data/db --port 27018 --bind_ip_all
    volumes:
      - ~/mongo_cluster/data4:/data/db

  mongors2n2:
    container_name: mongors2n2
    hostname: mongors2n2
    image: mongo
    command: mongod --shardsvr --replSet mongors2 --dbpath /data/db --port 27018 --bind_ip_all
    volumes:
      - ~/mongo_cluster/data5:/data/db

  mongors2n3:
    container_name: mongors2n3
    hostname: mongors2n3
    image: mongo
    command: mongod --shardsvr --replSet mongors2 --dbpath /data/db --port 27018 --bind_ip_all
    volumes:
      - ~/mongo_cluster/data6:/data/db

# mongos router
  mongos1:
    container_name: mongos1
    hostname: mongos1
    image: mongo
    depends_on:
      - mongocfg1
      - mongocfg2
    command: mongos --configdb mongors1conf/mongocfg1:27019,mongocfg2:27019,mongocfg3:27019 --port 27017 --bind_ip_all
    ports:
      - 27017:27017

  mongos2:
    container_name: mongos2
    hostname: mongos2
    image: mongo
    depends_on:
      - mongocfg1
      - mongocfg2
    command: mongos --configdb mongors1conf/mongocfg1:27019,mongocfg2:27019,mongocfg3:27019 --port 27017 --bind_ip_all
    ports:
      - 27016:27017

... and some scripts to finalize the initialization...

docker-compose up -d
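
If you would rather not guess at the timing, you can poll until the first config server answers a ping before continuing. A small sketch (the container name and port match the compose file above; adjust to taste):

until docker exec mongocfg1 mongo --host localhost:27019 --quiet --eval 'db.runCommand({ping: 1})' > /dev/null 2>&1; do
  echo "waiting for mongocfg1..."
  sleep 2
done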

... Give it a few seconds to wind up, then issue...

# Init the replica sets (use the MONGOS host)
docker exec -it mongos1 bash -c "echo 'rs.initiate({_id: \"mongors1conf\", configsvr: true, members: [{ _id : 0, host : \"mongocfg1:27019\", priority: 2 },{ _id : 1, host : \"mongocfg2:27019\" }, { _id : 2, host : \"mongocfg3:27019\" }]})' | mongo --host mongocfg1:27019"
docker exec -it mongos1 bash -c "echo 'rs.initiate({_id : \"mongors1\", members: [{ _id : 0, host : \"mongors1n1:27018\", priority: 2 },{ _id : 1, host : \"mongors1n2:27018\" },{ _id : 2, host : \"mongors1n3:27018\" }]})' | mongo --host mongors1n1:27018"
docker exec -it mongos1 bash -c "echo 'rs.initiate({_id : \"mongors2\", members: [{ _id : 0, host : \"mongors2n1:27018\", priority: 2 },{ _id : 1, host : \"mongors2n2:27018\" },{ _id : 2, host : \"mongors2n3:27018\" }]})' | mongo --host mongors2n1:27018"

... again, give 10-15 seconds to allow the system to adjust to recent commands...

# ADD TWO SHARDS (mongors1, and mongors2)
docker exec -it mongos1 bash -c "echo 'sh.addShard(\"mongors1/mongors1n1:27018,mongors1n2:27018,mongors1n3:27018\")' | mongo"
docker exec -it mongos1 bash -c "echo 'sh.addShard(\"mongors2/mongors2n1:27018,mongors2n2:27018,mongors2n3:27018\")' | mongo"

Now, try to connect to the mongos from the host running docker (this assumes you have the mongo shell installed on that host). Use the 2 mongos listeners as the seed list.

mongo --host "localhost:27017,localhost:27016"

Comments

Notice how the priority for node0 is set to 2 in each rs.initiate() call?
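
If a replica set was already initiated without that priority, the same effect can be applied after the fact with a reconfig, which is what the TODO comment in the original compose file was getting at. From a mongo shell connected to the replica set primary:

conf = rs.conf();
conf.members[0].priority = 2;
rs.reconfig(conf);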

Notice how the config servers are all on port 27019 - this follows MongoDB's recommendations.

Notice how the shard servers are all on port 27018 - again, following mongo recommendations.

The mongos routers expose 2 ports: 27017 (the natural port for MongoDB) and 27016 (a secondary mongos for high availability).

The config servers and the shard servers do not expose their respective ports, for security reasons. You should be going through the mongos to get to the data. If you need these ports open for administrative purposes, simply add them to the docker compose file.
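
For example, to reach the first config server directly from the docker host, you could add a ports mapping to its service definition (illustrative only):

# add to the mongocfg1 service in docker-compose.yml
    ports:
      - 27019:27019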

The replica-set intercommunication is not using authentication. This is a security no-no. You need to decide which auth mechanism is best for your scenario - you can use a keyfile (just a text file that is identical among the members of the replica set) or x509 certs. If going with x509, you need to include the CA cert in each docker container for reference, along with an individual cert per server with proper host name alignment. You would also need to add the startup configuration item telling the mongod processes to use whichever auth method was selected.
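
As a minimal sketch of the keyfile option (the file name and mount path here are assumptions, and the file must be readable by the user running mongod inside the container):

# generate the keyfile once on the docker host
openssl rand -base64 756 > ~/mongo_cluster/mongodb-keyfile
chmod 400 ~/mongo_cluster/mongodb-keyfile

# then, for every mongod and mongos service in docker-compose.yml, mount it:
#   volumes:
#     - ~/mongo_cluster/mongodb-keyfile:/data/mongodb-keyfile
# ... and append to the command:
#   --keyFile /data/mongodb-keyfile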

Logging is not specified. It probably makes sense to set the logging output of the mongod and mongos processes to the default locations of /var/log/mongodb/mongod.log and /var/log/mongodb/mongos.log. Without a logging strategy specified, I believe mongo logs to standard out, which is hidden from the terminal when running docker-compose up -d.
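
The detached stdout logs are still reachable through docker, and a file destination can be set with the standard mongod/mongos flags. A couple of hedged examples:

# view the (otherwise hidden) stdout logs of a detached service
docker-compose logs -f mongos1

# or append to each service command to log to a file instead:
#   --logpath /var/log/mongodb/mongod.log --logappend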

Superuser: no users are created on the system yet. Usually, for every replica set I stand up before adding it to a sharded cluster, I like to add a superuser account - one having root access - so that if I need to make administrative changes at the replica set level, I can. With the docker-compose approach you can create a superuser from the mongos perspective and perform almost all the operations needed on a sharded cluster, but still - I like having the replica set user available.
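
A hedged example of creating such an account through a mongos (the user name and password are placeholders - choose your own):

docker exec -it mongos1 bash -c "echo 'db.getSiblingDB(\"admin\").createUser({user: \"admin\", pwd: \"CHANGE_ME\", roles: [{role: \"root\", db: \"admin\"}]})' | mongo"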

OS tunables - Mongo likes to take up all the system resources. For a shared ecosystem where one physical host runs a bunch of mongo processes, you might want to consider capping the WiredTiger cache size, etc. By default, WiredTiger wants (System Memory Size - 1 GB) / 2. You would also benefit from setting ulimits to proper values - e.g., 64000 file handles per user is a good start, since mongo can use a lot of files. The filesystem backing the data directories should ideally be xfs. This strategy uses the host user's home directory for the database data directories; a more thoughtful approach would be welcome here.
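
Both of these knobs can be expressed in the compose file. An illustrative fragment for one shard member (the values are arbitrary starting points, not recommendations):

  mongors1n1:
    # ... existing settings ...
    command: mongod --shardsvr --replSet mongors1 --dbpath /data/db --port 27018 --bind_ip_all --wiredTigerCacheSizeGB 1
    ulimits:
      nofile:
        soft: 64000
        hard: 64000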

Anything else?

I am sure I am missing something. If you have any questions, please leave a comment and I will reply.

Update 1

The above docker-compose.yml file was missing the hostname attribute for some of the hosts, and this was causing balancer issues, so I have edited the docker-compose.yml to include hostname on all hosts.

Also, the addShard() method only referred to one host of the replica set. For completeness I added the other hosts to the addShard() method described above.

Following these steps will result in a brand new sharded cluster, but there are no user databases yet. As such, no user databases are sharded. So let's take a moment to add a database and shard it, then view the shard distribution (a.k.a. the balancer results).

We must connect to the database via the mongos (as described above). This example assumes the use of the mongo shell.

mongo --host "localhost:27017,localhost:27016"

Databases in Mongo can be created in a variety of ways. While there is no explicit database-create command, there is an explicit collection-create command (db.createCollection()). We must first set the database context using a 'use' command...

use mydatabase
db.createCollection("mycollection")

... but rather than use this command, we can create the database and the collection by creating an index on a non-existent collection. (If you already created the collection, no worries - the next command should still succeed.)

use mydatabase
db.mycollection.createIndex({lastName: 1, creationDate: 1})

In this example, I created a compound index on two fields...

  • lastName
  • creationDate

... on a collection that does not yet exist, in a database that does not yet exist. Once I issue this command, both the database and the collection will be created. Furthermore, I now have the basis for a shard key - the key on which the sharding distribution will be based. The shard key will be backed by this new index over these two fields.

Shard the database

Assuming I have issued the createIndex command, I can now enable sharding on the database and issue the shardCollection command...

sh.enableSharding("mydatabase")
sh.shardCollection("mydatabase.mycollection", { "lastName": 1, "creationDate": 1})

Notice how the shardCollection() command refers to the indexed fields we created earlier? Assuming sharding has been applied successfully, we can now view the distribution of data by issuing the sh.status() command.

sh.status()

Example output (new collection, no data yet, thus no real distribution of data - we need to insert more than 64MB of data so that there is more than one chunk to distribute):

mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("6101c030a98b2cc106034695")
  }
  shards:
        {  "_id" : "mongors1",  "host" : "mongors1/mongors1n1:27018,mongors1n2:27018,mongors1n3:27018",  "state" : 1,  "topologyTime" : Timestamp(1627504744, 1) }
        {  "_id" : "mongors2",  "host" : "mongors2/mongors2n1:27018,mongors2n2:27018,mongors2n3:27018",  "state" : 1,  "topologyTime" : Timestamp(1627504753, 1) }
  active mongoses:
        "5.0.1" : 2
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled: yes
        Currently running: no
        Failed balancer rounds in last 5 attempts: 0
        Migration results for the last 24 hours: 
                No recent migrations
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
        {  "_id" : "mydatabase",  "primary" : "mongors2",  "partitioned" : true,  "version" : {  "uuid" : UUID("bc890722-00c6-4cbe-a3e1-eab9692faf93"),  "timestamp" : Timestamp(1627504768, 2),  "lastMod" : 1 } }
                mydatabase.mycollection
                        shard key: { "lastName" : 1, "creationDate" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                mongors2    1
                        { "lastName" : { "$minKey" : 1 }, "creationDate" : { "$minKey" : 1 } } -->> { "lastName" : { "$maxKey" : 1 }, "creationDate" : { "$maxKey" : 1 } } on : mongors2 Timestamp(1, 0) 

Insert some data

To test out the sharding we can add some test data. Again, we want to distribute by lastName and creationDate.

In the mongo shell we can run JavaScript. Here is a script that creates test records so that the data will be split and balanced. It creates 500,000 fake records; we need more than 64MB of data before there is another chunk to balance, and 500,000 records make approx. 5 chunks. This takes a couple of minutes to run and complete.

use mydatabase

function randomInteger(min, max) {
    return Math.floor(Math.random() * (max - min) + min);
} 

function randomAlphaNumeric(length) {
  var result = [];
  var characters = 'abcdef0123456789';
  var charactersLength = characters.length;

  for ( var i = 0; i < length; i++ ) {
    result.push(characters.charAt(Math.floor(Math.random() * charactersLength)));
  }

  return result.join('');
}

function generateDocument() {
  return {
    lastName: randomAlphaNumeric(8),
    creationDate: new Date(),
    stringFixedLength: randomAlphaNumeric(8),
    stringVariableLength: randomAlphaNumeric(randomInteger(5, 50)),
    integer1: NumberInt(randomInteger(0, 2000000)),
    long1: NumberLong(randomInteger(0, 100000000)),
    date1: new Date(),
    guid1: new UUID()
  };
}

for (var j = 0; j < 500; j++) {
  var batch=[];

  for (var i = 0; i < 1000; i++) {
    batch.push(
      {insertOne: {
          document: generateDocument() 
        } 
      }
    );
  }
  
  db.mycollection.bulkWrite(batch, {ordered: false});
}
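
Before checking the balancer, it can be worth confirming that the load actually landed. A quick check from the same shell:

db.mycollection.countDocuments()   // expect 500000
db.mycollection.stats().size       // data size in bytes; needs to exceed 64MB for multiple chunks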

Give it a few minutes, then review in the mongo shell; if we now look at the shard status, we should see chunks distributed across both shards...

sh.status()

... we should see something similar to ...

mongos> sh.status()
--- Sharding Status --- 
  sharding version: {
    "_id" : 1,
    "minCompatibleVersion" : 5,
    "currentVersion" : 6,
    "clusterId" : ObjectId("6101c030a98b2cc106034695")
  }
  shards:
        {  "_id" : "mongors1",  "host" : "mongors1/mongors1n1:27018,mongors1n2:27018,mongors1n3:27018",  "state" : 1,  "topologyTime" : Timestamp(1627504744, 1) }
        {  "_id" : "mongors2",  "host" : "mongors2/mongors2n1:27018,mongors2n2:27018,mongors2n3:27018",  "state" : 1,  "topologyTime" : Timestamp(1627504753, 1) }
  active mongoses:
        "5.0.1" : 2
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled: yes
        Currently running: yes
        Collections with active migrations: 
                config.system.sessions started at Wed Jul 28 2021 20:44:25 GMT+0000 (UTC)
        Failed balancer rounds in last 5 attempts: 0
        Migration results for the last 24 hours: 
                60 : Success
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                mongors1    965
                                mongors2    59
                        too many chunks to print, use verbose if you want to force print
        {  "_id" : "mydatabase",  "primary" : "mongors2",  "partitioned" : true,  "version" : {  "uuid" : UUID("bc890722-00c6-4cbe-a3e1-eab9692faf93"),  "timestamp" : Timestamp(1627504768, 2),  "lastMod" : 1 } }
                mydatabase.mycollection
                        shard key: { "lastName" : 1, "creationDate" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                mongors1    2
                                mongors2    3
                        { "lastName" : { "$minKey" : 1 }, "creationDate" : { "$minKey" : 1 } } -->> {
                            "lastName" : "00001276",
                            "creationDate" : ISODate("2021-07-28T20:42:00.867Z")
                        } on : mongors1 Timestamp(2, 0) 
                        {
                            "lastName" : "00001276",
                            "creationDate" : ISODate("2021-07-28T20:42:00.867Z")
                        } -->> {
                            "lastName" : "623292c2",
                            "creationDate" : ISODate("2021-07-28T20:42:01.046Z")
                        } on : mongors1 Timestamp(3, 0) 
                        {
                            "lastName" : "623292c2",
                            "creationDate" : ISODate("2021-07-28T20:42:01.046Z")
                        } -->> {
                            "lastName" : "c3f2a99a",
                            "creationDate" : ISODate("2021-07-28T20:42:06.474Z")
                        } on : mongors2 Timestamp(3, 1) 
                        {
                            "lastName" : "c3f2a99a",
                            "creationDate" : ISODate("2021-07-28T20:42:06.474Z")
                        } -->> {
                            "lastName" : "ed75c36c",
                            "creationDate" : ISODate("2021-07-28T20:42:03.984Z")
                        } on : mongors2 Timestamp(1, 6) 
                        {
                            "lastName" : "ed75c36c",
                            "creationDate" : ISODate("2021-07-28T20:42:03.984Z")
                        } -->> { "lastName" : { "$maxKey" : 1 }, "creationDate" : { "$maxKey" : 1 } } on : mongors2 Timestamp(2, 1) 

... Here we can see evidence of balancing activities. See the "chunks" labels for mongors1 and mongors2. While it is balancing our test collection, it is also pre-splitting and balancing a different collection for session data. I believe this is a one-time system automation.
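
Another way to inspect the result, from the mongos shell, is the per-collection distribution helper:

use mydatabase
db.mycollection.getShardDistribution()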

I hope these details help. Please let me know if you have any other questions.
