Parallel code execution in Docker containers
Question
I have a script that scrapes data from a list of URLs.
The script runs in a Docker container.
I would like to run multiple instances of it, for example 20.
For that, I wanted to use docker-compose scale worker=20
and pass an INDEX to each instance so that the script knows which URLs it should scrape.
Example:
ID URL
0 https://example.org/sdga2
1 https://example.org/fsdh34
2 https://example.org/fs4h35
3 https://example.org/f1h36
4 https://example.org/fs4h37
...
If there are 3 instances, the first instance of the script (INDEX = 0) should process the URLs whose IDs are 0, 3, 6, 9, ..., i.e. ID = INDEX + INSTANCES_NUM * k for k = 0, 1, 2, ...
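To make the intended split concrete, here is a minimal Python sketch of the rule ID = INDEX + INSTANCES_NUM * k. The question does not say which language the scraper is written in, so Python and the function name urls_for_instance are assumptions made only for illustration.

```python
def urls_for_instance(urls, index, instances_num):
    """Return the items whose position (ID) equals index + instances_num * k.

    `urls` is the full list, ordered by ID starting at 0.
    `index` is this instance's 0-based INDEX.
    `instances_num` is the total number of instances (INSTANCES_NUM).
    """
    if instances_num <= 0:
        raise ValueError("instances_num must be positive")
    if not 0 <= index < instances_num:
        raise ValueError("index must satisfy 0 <= index < instances_num")
    # Extended slicing starts at `index` and then steps by `instances_num`.
    return urls[index::instances_num]


if __name__ == "__main__":
    ids = list(range(10))  # stand-ins for URL IDs 0..9
    for index in range(3):
        print(index, urls_for_instance(ids, index, 3))
```

With 3 instances, every ID lands in exactly one instance's share, so no URL is scraped twice or skipped.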
I don't know how to pass INDEX to the script running in the Docker container. Of course, I could duplicate the service in docker-compose.yml and give each copy a different INDEX in its environment variables, but with more than 10 instances, or even 50, that would be a very bad solution.
Does anyone know how to do this?
Recommended answer
With docker-compose, I don't believe there's any support for this. However, with swarm mode, which can use a similar compose file, you can pass {{.Task.Slot}} as an environment variable using service templates (docs.docker.com/engine/reference/commandline/#create-services-using-templates). E.g.
version: '3'
services:
  test:
    image: busybox
    command: /bin/sh -c "echo My task number is $$task_id && tail -f /dev/null"
    environment:
      task_id: "{{.Task.Slot}}"
    deploy:
      replicas: 5
Instead of docker-compose up, I deploy with docker stack deploy -c docker-compose.yml test. My local swarm cluster is just a single node created with docker swarm init.
Then, reviewing each of these running containers:
$ docker ps --filter label=com.docker.swarm.service.name=test_test
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ccd0dbebbcbe busybox:latest "/bin/sh -c 'echo My…" About a minute ago Up About a minute test_test.3.i3jg6qrg09wjmntq1q17690q4
bfaa22fa3342 busybox:latest "/bin/sh -c 'echo My…" About a minute ago Up About a minute test_test.5.iur5kg6o3hn5wpmudmbx3gvy1
a372c0ce39a2 busybox:latest "/bin/sh -c 'echo My…" About a minute ago Up About a minute test_test.4.rzmhyjnjk00qfs0ljpfyyjz73
0b47d19224f6 busybox:latest "/bin/sh -c 'echo My…" About a minute ago Up About a minute test_test.1.tm97lz6dqmhl80dam6bsuvc8j
c968cb5dbb5f busybox:latest "/bin/sh -c 'echo My…" About a minute ago Up About a minute test_test.2.757e8evknx745120ih5lmhk34
$ docker ps --filter label=com.docker.swarm.service.name=test_test -q | xargs -n 1 docker logs
My task number is 3
My task number is 5
My task number is 4
My task number is 1
My task number is 2
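Note that the slots in the logs above start at 1, while the INDEX in the question starts at 0. A minimal sketch of how a worker script might turn the task_id variable into a 0-based INDEX follows. It assumes the script is written in Python and that the total instance count is supplied through a second environment variable, here called INSTANCES_NUM. That variable and the function name shard_from_env are hypothetical and not part of the compose file above.

```python
import os


def shard_from_env(env=os.environ):
    """Return (index, instances_num) for this worker.

    Assumptions (not confirmed by the compose file above):
    - `task_id` holds the swarm slot, numbered from 1 as in the logs.
    - `INSTANCES_NUM` is a hypothetical variable holding the replica count.
    """
    slot = int(env["task_id"])
    instances_num = int(env["INSTANCES_NUM"])
    if not 1 <= slot <= instances_num:
        raise ValueError("task_id must be between 1 and INSTANCES_NUM")
    # Slots are 1-based, the INDEX from the question is 0-based.
    return slot - 1, instances_num


if __name__ == "__main__":
    # Demo with a plain dict instead of the real environment.
    index, instances_num = shard_from_env({"task_id": "3", "INSTANCES_NUM": "5"})
    print(index, instances_num)  # prints: 2 5
```

Inside the container the script would call shard_from_env() with no argument and then process only the URLs whose ID equals index + instances_num * k.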