How to create a docker overlay network between multiple hosts?

Question

I have been trying to create an overlay network between two hosts with no success. I keep getting the error message:

mavungu@mavungu-Aspire-5250:~$ sudo docker -H tcp://192.168.0.18:2380 network create -d overlay myapp
Error response from daemon: 500 Internal Server Error: failed to parse pool request for address space "GlobalDefault" pool "" subpool "": cannot find address space GlobalDefault (most likely the backing datastore is not configured)

mavungu@mavungu-Aspire-5250:~$ sudo docker network create -d overlay myapp
[sudo] password for mavungu:
Error response from daemon: failed to parse pool request for address space "GlobalDefault" pool "" subpool "": cannot find address space GlobalDefault (most likely the backing datastore is not configured)

My environment details:

mavungu@mavungu-Aspire-5250:~$ sudo docker info
Containers: 1
Images: 364
Server Version: 1.9.1
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 368
 Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 3.19.0-26-generic
Operating System: Ubuntu 15.04
CPUs: 2
Total Memory: 3.593 GiB
Name: mavungu-Aspire-5250
Registry: https://index.docker.io/v1/
WARNING: No swap limit support

I have a swarm cluster working well with consul as the discovery mechanism:

mavungu@mavungu-Aspire-5250:~$ sudo docker -H tcp://192.168.0.18:2380 info 

Containers: 4 
Images: 51 
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 2
mavungu-Aspire-5250: 192.168.0.36:2375
└ Containers: 1
└ Reserved CPUs: 0 / 2
└ Reserved Memory: 0 B / 3.773 GiB
└ Labels: executiondriver=native-0.2, kernelversion=3.19.0-26-generic, operatingsystem=Ubuntu 15.04, storagedriver=aufs
mavungu-HP-Pavilion-15-Notebook-PC: 192.168.0.18:2375
└ Containers: 3
└ Reserved CPUs: 0 / 4
└ Reserved Memory: 0 B / 3.942 GiB
└ Labels: executiondriver=native-0.2, kernelversion=4.2.0-19-generic, operatingsystem=Ubuntu 15.10, storagedriver=aufs
CPUs: 6
Total Memory: 7.715 GiB
Name: bb47f4e57436

My consul is available at 192.168.0.18:8500 and it works well with the swarm cluster.

I would like to be able to create an overlay network across the two hosts. I have configured the docker engines on both hosts with these additional settings:

DOCKER_OPTS="-D --cluster-store-consul://192.168.0.18:8500 --cluster-advertise=192.168.0.18:0"

DOCKER_OPTS="-D --cluster-store-consul://192.168.0.18:8500 --cluster-advertise=192.168.0.36:0"

I had to stop and restart the engines and reset the swarm cluster... After failing to create the overlay network, I changed the --cluster-advertise setting to this:

DOCKER_OPTS="-D --cluster-store-consul://192.168.0.18:8500 --cluster-advertise=192.168.0.18:2375"

DOCKER_OPTS="-D --cluster-store-consul://192.168.0.18:8500 --cluster-advertise=192.168.0.36:2375"

But still it did not work. I am not sure what ip:port should be set for --cluster-advertise=. Docs, discussions and tutorials are not clear about this advertise setting.

There is something that I am getting wrong here. Please help.

Answer

When you execute the docker run command, be sure to add --net myapp. Here is a full step-by-step tutorial (online version):
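For instance, against the swarm endpoint from the question, launching a container attached to that network might look like this (nginx is just an illustrative image):

sudo docker -H tcp://192.168.0.18:2380 run -itd --net=myapp nginx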

TL;DR: step-by-step tutorial to deploy a multi-host network using Swarm. I wanted to put this tutorial online ASAP so I didn't even take time for the presentation. The markdown file is available on the github of my website. Feel free to adapt and share it; it is licensed under a Creative Commons Attribution 4.0 International License.

  • Tutorial done with docker engine 1.9.0.
  • Swarm agents are discovered through a shared file (other methods are available).
  • Consul 0.5.2 is used for the discovery of the multi-host network swarm containers.

Swarm manager and consul master will be run on the machine named bugs20. Other nodes, bugs19, bugs18, bugs17 and bugs16, will be swarm agents and consul members.

Consul is used for the multi-host networking; any other key-value store could be used -- note that the engine supports Consul, Etcd, and ZooKeeper. Tokens (or a static file) are used for the swarm agent discovery. Tokens use a REST API; a static file is preferred.

The network range is 192.168.196.0/25. The host named bugsN has the IP address 192.168.196.N.

All nodes are running the docker daemon as follows:

/usr/bin/docker daemon -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --cluster-advertise eth0:2375 --cluster-store consul://127.0.0.1:8500

Option details:

-H tcp://0.0.0.0:2375

Binds the daemon to an interface to allow it to be part of the swarm cluster. An IP address can obviously be specified; it is a better solution if you have several NICs.
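For instance, binding to a single NIC's address instead of all interfaces would look like this (192.168.196.16 stands in for the host's own address):

/usr/bin/docker daemon -H tcp://192.168.196.16:2375 -H unix:///var/run/docker.sock --cluster-advertise eth0:2375 --cluster-store consul://127.0.0.1:8500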

--cluster-advertise eth0:2375

Defines the interface and port that the docker daemon should use to advertise itself.

--cluster-store consul://127.0.0.1:8500

Defines the URL of the distributed storage backend. In our case we use consul, though there are other discovery tools that can be used; if you want to make up your mind you should be interested in reading this service discovery comparison.

As consul is distributed, the URL can be local (remember, swarm agents are also consul members), which is more flexible since you don't have to specify the IP address of the consul master, and the consul node can be selected after the docker daemon has been started.
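If you prefer to keep these flags in a config file rather than on the command line (as the DOCKER_OPTS lines in the question do), a sketch might look like this; the file location (/etc/default/docker) and whether your init system actually reads it are assumptions:

# /etc/default/docker  (assumed location; not all init systems read this file)
DOCKER_OPTS="-H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock --cluster-advertise eth0:2375 --cluster-store consul://127.0.0.1:8500"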

In the following commands these two aliases are used:

alias ldocker='docker -H tcp://0.0.0.0:2375'
alias swarm-docker='docker -H tcp://0.0.0.0:5732' #used only on the swarm manager

Be sure to have the path of the consul binary in your $PATH. Once you are in the directory, export PATH=$PATH:$(pwd) will do the trick.

It is also assumed that the variable $IP has been properly set and exported. It can be done, thanks to .bashrc or .zshrc or else, with something like this:

export IP=$(ifconfig |grep "192.168.196."|cut -d ":" -f 2|cut -d " " -f 1)
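The ifconfig output format differs between distributions; an equivalent using the ip tool (still assuming the 192.168.196.0/25 addressing used here) could be:

export IP=$(ip -4 addr | grep -o "192\.168\.196\.[0-9]*" | head -n 1)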

Consul

Let's start to deploy all consul members and master as needed.

consul agent -server -bootstrap-expect 1 -data-dir /tmp/consul -node=master20 -bind=$IP -client $IP

Option details:

agent -server

Start the consul agent as a server.

-bootstrap-expect 1

We expect only one master.

-node=master20

This consul server/master will be named "master20".

-bind=192.168.196.20

Specifies the IP address on which it should be bound. Optional if you have only one NIC.

-client=192.168.196.20

Specifies the RPC IP address on which the server should be bound. By default it is localhost. Note that I am unsure about the necessity of this option, and it forces you to add -rpc-addr=192.168.196.20:8400 for local requests such as consul members -rpc-addr=192.168.196.20:8400 or consul join -rpc-addr=192.168.196.20:8400 192.168.196.9 to join the consul member that has the IP address 192.168.196.9.
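To avoid repeating -rpc-addr on every local consul invocation, shell aliases can help; the names cmembers and cjoin below are purely illustrative:

alias cmembers='consul members -rpc-addr=192.168.196.20:8400'
alias cjoin='consul join -rpc-addr=192.168.196.20:8400'
# e.g. cjoin 192.168.196.9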

consul agent -data-dir /tmp/consul -node=$HOSTNAME -bind=192.168.196.N

It is suggested to use tmux, or similar, with the option :setw synchronize-panes on so that this one command, consul agent -data-dir /tmp/consul -node=$HOSTNAME -bind=$IP, starts all consul members.
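A minimal sketch of that tmux workflow (the session name and the pane/ssh setup are assumptions):

tmux new-session -s consul        # open one pane per node and ssh into each
# press the tmux prefix key, then type:  :setw synchronize-panes on
# typed once, the next command now runs on every node:
consul agent -data-dir /tmp/consul -node=$HOSTNAME -bind=$IP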

consul join -rpc-addr=192.168.196.20:8400 192.168.196.16
consul join -rpc-addr=192.168.196.20:8400 192.168.196.17
consul join -rpc-addr=192.168.196.20:8400 192.168.196.18
consul join -rpc-addr=192.168.196.20:8400 192.168.196.19

A one-line command can be used too. If you are using zsh, then consul join -rpc-addr=192.168.196.20:8400 192.168.196.{16..19} is enough, or a for loop: for i in $(seq 16 1 19); do consul join -rpc-addr=192.168.196.20:8400 192.168.196.$i; done. You can verify if your members are part of your consul deployment with the command:

consul members -rpc-addr=192.168.196.20:8400
Node      Address              Status  Type    Build  Protocol  DC
master20  192.168.196.20:8301  alive   server  0.5.2  2         dc1
bugs19    192.168.196.19:8301  alive   client  0.5.2  2         dc1
bugs18    192.168.196.18:8301  alive   client  0.5.2  2         dc1
bugs17    192.168.196.17:8301  alive   client  0.5.2  2         dc1
bugs16    192.168.196.16:8301  alive   client  0.5.2  2         dc1

Consul members and master are deployed and working. The focus will now be on docker and swarm.

In the following, the creation of the swarm manager and the discovery of swarm members are detailed using two different methods: token and static file. Tokens use a hosted discovery service with Docker Hub while the static file is just local and does not use the network (nor any server). The static-file solution should be preferred (and is actually easier).

[static file] Create a file named /tmp/cluster.disco with the content swarm_agent_ip:2375.

cat /tmp/cluster.disco
192.168.196.16:2375
192.168.196.17:2375
192.168.196.18:2375
192.168.196.19:2375

Then just start the swarm manager as follows:

ldocker run -v /tmp/cluster.disco:/tmp/cluster.disco -d -p 5732:2375 swarm manage file:///tmp/cluster.disco

And you are done!
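To check that the manager container is up with the static-file method, the same ldocker ps listing used below for the token method applies:

ldocker ps    # a container running "swarm manage file:///tmp/cluster.disco" should be listed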

[token] On the swarm master (bugs20), create a swarm:

ldocker run --rm swarm create > swarm_id

This creates a swarm and saves the token ID in the file swarm_id in the current directory. Once created, the swarm manager needs to be run as a daemon:

ldocker run -d -p 5732:2375 swarm manage token://`cat swarm_id`

To verify if it is started you can run:

ldocker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
d28238445532        swarm               "/swarm manage token:"   5 seconds ago       Up 4 seconds        0.0.0.0:5732->2375/tcp   cranky_liskov

[token] Join swarm members into the swarm cluster

Then the swarm manager will need some swarm agents to join.

ldocker run swarm join --addr=192.168.196.16:2375 token://`cat swarm_id`
ldocker run swarm join --addr=192.168.196.17:2375 token://`cat swarm_id`
ldocker run swarm join --addr=192.168.196.18:2375 token://`cat swarm_id`
ldocker run swarm join --addr=192.168.196.19:2375 token://`cat swarm_id`

std[in|out] will be busy, so these commands need to be run in different terminals. Adding -d before the join should solve this and enables a for loop to be used for the joins.
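A sketch of such a loop, assuming the token method and the swarm_id file created above:

for i in $(seq 16 1 19); do
    ldocker run -d swarm join --addr=192.168.196.$i:2375 token://`cat swarm_id`
done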

After the join of the swarm members:

auzias@bugs20:~$ ldocker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
d1de6e4ee3fc        swarm               "/swarm join --addr=1"   5 seconds ago       Up 4 seconds        2375/tcp                 fervent_lichterman
338572b87ce9        swarm               "/swarm join --addr=1"   6 seconds ago       Up 4 seconds        2375/tcp                 mad_ramanujan
7083e4d6c7ea        swarm               "/swarm join --addr=1"   7 seconds ago       Up 5 seconds        2375/tcp                 naughty_sammet
0c5abc6075da        swarm               "/swarm join --addr=1"   8 seconds ago       Up 6 seconds        2375/tcp                 gloomy_cray
ab746399f106        swarm               "/swarm manage token:"   25 seconds ago      Up 23 seconds       0.0.0.0:5732->2375/tcp   ecstatic_shockley

After the swarm member discovery

To verify that the members are well discovered, you can execute swarm-docker info:

auzias@bugs20:~$ swarm-docker info
Containers: 4
Images: 4
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 4
 bugs16: 192.168.196.16:2375
  └ Containers: 0
  └ Reserved CPUs: 0 / 12
  └ Reserved Memory: 0 B / 49.62 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.16.0-4-amd64, operatingsystem=Debian GNU/Linux 8 (jessie), storagedriver=aufs
 bugs17: 192.168.196.17:2375
  └ Containers: 0
  └ Reserved CPUs: 0 / 12
  └ Reserved Memory: 0 B / 49.62 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.16.0-4-amd64, operatingsystem=Debian GNU/Linux 8 (jessie), storagedriver=aufs
 bugs18: 192.168.196.18:2375
  └ Containers: 0
  └ Reserved CPUs: 0 / 12
  └ Reserved Memory: 0 B / 49.62 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.16.0-4-amd64, operatingsystem=Debian GNU/Linux 8 (jessie), storagedriver=aufs
 bugs19: 192.168.196.19:2375
  └ Containers: 4
  └ Reserved CPUs: 0 / 12
  └ Reserved Memory: 0 B / 49.62 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.16.0-4-amd64, operatingsystem=Debian GNU/Linux 8 (jessie), storagedriver=aufs
CPUs: 48
Total Memory: 198.5 GiB
Name: ab746399f106

At this point swarm is deployed and all containers that are run will be scheduled over the different nodes. By executing several:

auzias@bugs20:~$ swarm-docker run --rm -it ubuntu bash

Followed by:

auzias@bugs20:~$ swarm-docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
45b19d76d38e        ubuntu              "bash"              6 seconds ago       Up 5 seconds                            bugs18/boring_mccarthy
53e87693606e        ubuntu              "bash"              6 seconds ago       Up 5 seconds                            bugs16/amazing_colden
b18081f26a35        ubuntu              "bash"              6 seconds ago       Up 4 seconds                            bugs17/small_newton
f582d4af4444        ubuntu              "bash"              7 seconds ago       Up 4 seconds                            bugs18/naughty_banach
b3d689d749f9        ubuntu              "bash"              7 seconds ago       Up 4 seconds                            bugs17/pensive_keller
f9e86f609ffa        ubuntu              "bash"              7 seconds ago       Up 5 seconds                            bugs16/pensive_cray
b53a46c01783        ubuntu              "bash"              7 seconds ago       Up 4 seconds                            bugs18/reverent_ritchie
78896a73191b        ubuntu              "bash"              7 seconds ago       Up 5 seconds                            bugs17/gloomy_bell
a991d887a894        ubuntu              "bash"              7 seconds ago       Up 5 seconds                            bugs16/angry_swanson
a43122662e92        ubuntu              "bash"              7 seconds ago       Up 5 seconds                            bugs17/pensive_kowalevski
68d874bc19f9        ubuntu              "bash"              7 seconds ago       Up 5 seconds                            bugs16/modest_payne
e79b3307f6e6        ubuntu              "bash"              7 seconds ago       Up 5 seconds                            bugs18/stoic_wescoff
caac9466d86f        ubuntu              "bash"              7 seconds ago       Up 5 seconds                            bugs17/goofy_snyder
7748d01d34ee        ubuntu              "bash"              7 seconds ago       Up 5 seconds                            bugs16/fervent_einstein
99da2a91a925        ubuntu              "bash"              7 seconds ago       Up 5 seconds                            bugs18/modest_goodall
cd308099faac        ubuntu              "bash"              7 seconds ago       Up 6 seconds                            bugs19/furious_ritchie

As shown, the containers are disseminated over bugs{16...19}.

A network overlay is needed so that all the containers can be "plugged into" this overlay. To create this network overlay, execute:

auzias@bugs20:~$ swarm-docker network create -d overlay net
auzias@bugs20:~$ swarm-docker network ls|grep "net"
c96760503d06        net                 overlay

Voilà!

Once this overlay is created, add --net net to the command swarm-docker run --rm -it ubuntu bash and all your containers will be able to communicate natively as if they were on the same LAN. The default network is 10.0.0.0/24.
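As a quick check that the overlay works, two containers started with --net net should be able to reach each other by name (the name c1 is illustrative, and this assumes the ubuntu image ships ping):

swarm-docker run -itd --name=c1 --net=net ubuntu bash
swarm-docker run --rm -it --net=net ubuntu ping -c 3 c1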

Multicast is not supported by the default overlay. Another driver is required to be able to use multicast. The docker plugin weave net does support multicast.

To use this driver, once installed, you will need to run $weave launch on all Swarm agents and on the Swarm manager. Then you'll need to connect the weave peers together; this is done by running $weave connect $SWARM_MANAGER_IP. It does not have to be the IP address of the Swarm manager, but it is cleaner to do so (or to use another node than the Swarm agents).

At this point the weave cluster is deployed, but no weave network has been created. Running $swarm-docker network create --driver weave weave-net will create the weave network named weave-net. Starting containers with --net weave-net will enable them to share the same LAN and use multicast. An example of a full command to start such containers is: $swarm-docker run --rm -it --privileged --net=weave-net ubuntu bash.
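A condensed sketch of the weave sequence described above (weave must already be installed on every node, and $SWARM_MANAGER_IP is whichever peer address you choose):

# on every Swarm agent and on the Swarm manager
weave launch
weave connect $SWARM_MANAGER_IP

# then create the multicast-capable network and use it
swarm-docker network create --driver weave weave-net
swarm-docker run --rm -it --privileged --net=weave-net ubuntu bash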
