Using a pip cache directory in docker builds

Problem description

I'm hoping to get my pip install instructions inside my docker builds as fast as possible.

I've read many posts explaining how adding your requirements.txt before the rest of the app helps you take advantage of Docker's own image cache if your requirements.txt hasn't changed. But this is no help at all when dependencies do change, even slightly.
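For reference, the layer-ordering technique those posts describe can be sketched as below. This is only an illustration: the /app working directory and a requirements.txt at the root of the build context are assumptions, not details from the question.

```dockerfile
FROM python:3.6-alpine
WORKDIR /app

# Copy only the dependency list first, so this layer and the install
# layer below stay cached until requirements.txt itself changes.
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy the rest of the application afterwards; edits to application code
# invalidate only the layers from this point on.
COPY . .
```

As soon as a single line of requirements.txt changes, the pip install layer is rebuilt from scratch, which is the limitation described above.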

The next step would be if we could use a consistent pip cache directory. By default, pip will cache downloaded packages in ~/.cache/pip (on Linux), and so if you're ever installing the same version of a module that has been installed before anywhere on the system, it shouldn't need to go and download it again, but instead simply use the cached version. If we could leverage a shared cache directory for docker builds, this could help speed up dependency installs a lot.
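Knowing where that cache lives matters for any attempt to share it. The following is a minimal sketch of how the location is resolved on Linux, assuming pip's documented behaviour of honouring the PIP_CACHE_DIR and XDG_CACHE_HOME environment variables. The helper name pip_cache_dir is hypothetical, and the sketch ignores pip's configuration files and the --cache-dir command-line option.

```shell
# Hypothetical helper: print the directory pip is expected to use for its
# cache on Linux, based on environment variables only.
pip_cache_dir() {
  if [ -n "${PIP_CACHE_DIR:-}" ]; then
    # An explicit PIP_CACHE_DIR takes precedence over the default location.
    printf '%s\n' "$PIP_CACHE_DIR"
  else
    # Default: $XDG_CACHE_HOME/pip, falling back to ~/.cache/pip.
    printf '%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}/pip"
  fi
}

pip_cache_dir
```

On pip 20.1 or newer, running pip cache dir should report the directory pip actually uses, which is the more reliable check when pip is installed.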

However, there doesn't appear to be any simple way to mount a volume while running docker build. The build environment seems to be basically impenetrable. I found one article suggesting a genius but complex method of running an rsync server on the host and then, with a hack inside the build to get the host IP, rsyncing the pip cache in from the host. But I'm not relishing the idea of running an rsync server in Jenkins (which isn't the most secure platform at the best of times).

Does anyone know if there's any other way to achieve a shared cache volume more simply?

Solution

I suggest using BuildKit, which can mount a persistent cache directory during a build.

Dockerfile:

# syntax = docker/dockerfile:experimental
FROM python:3.6-alpine
RUN --mount=type=cache,target=/root/.cache/pip pip install pyyaml

NOTE: the "# syntax = docker/dockerfile:experimental" line is required. You must add it at the very beginning of the Dockerfile to enable this feature.

1. First build:

export DOCKER_BUILDKIT=1
docker build --progress=plain -t abc:1 . --no-cache

Log of the first build:

#9 [stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install...
#9   digest: sha256:55b70da1cbbe4d424f8c50c0678a01e855510bbda9d26f1ac5b983808f3bf4a5
#9 name: "[stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install pyyaml"
#9  started: 2019-09-20 03:11:35.296107357 +0000 UTC
#9 1.955 Collecting pyyaml
#9 3.050   Downloading https://files.pythonhosted.org/packages/e3/e8/b3212641ee2718d556df0f23f78de8303f068fe29cdaa7a91018849582fe/PyYAML-5.1.2.tar.gz (265kB)
#9 5.006 Building wheels for collected packages: pyyaml
#9 5.007   Building wheel for pyyaml (setup.py): started
#9 5.249   Building wheel for pyyaml (setup.py): finished with status 'done'
#9 5.250   Created wheel for pyyaml: filename=PyYAML-5.1.2-cp36-cp36m-linux_x86_64.whl size=44104 sha256=867daf35eab43c2d047ad737ea1e9eaeb4168b87501cd4d62c533f671208acaa
#9 5.250   Stored in directory: /root/.cache/pip/wheels/d9/45/dd/65f0b38450c47cf7e5312883deb97d065e030c5cca0a365030
#9 5.267 Successfully built pyyaml
#9 5.274 Installing collected packages: pyyaml
#9 5.309 Successfully installed pyyaml-5.1.2
#9 completed: 2019-09-20 03:11:42.221146294 +0000 UTC
#9 duration: 6.925038937s

As the log shows, the first build downloads pyyaml from the internet.

2. Second build:

docker build --progress=plain -t abc:1 . --no-cache

Log of the second build:

#9 [stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install...
#9   digest: sha256:55b70da1cbbe4d424f8c50c0678a01e855510bbda9d26f1ac5b983808f3bf4a5
#9 name: "[stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install pyyaml"
#9  started: 2019-09-20 03:16:58.588157354 +0000 UTC
#9 1.786 Collecting pyyaml
#9 2.234 Installing collected packages: pyyaml
#9 2.270 Successfully installed pyyaml-5.1.2
#9 completed: 2019-09-20 03:17:01.933398002 +0000 UTC
#9 duration: 3.345240648s

As the log shows, the second build no longer downloads the package from the internet; it uses the cache instead. NOTE: this is not the traditional docker build cache, since I passed --no-cache. It is the /root/.cache/pip directory that I mounted into the build.

3. Third build, after deleting the BuildKit cache:

docker builder prune
docker build --progress=plain -t abc:1 . --no-cache

Log of the third build:

#9 [stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install...
#9   digest: sha256:55b70da1cbbe4d424f8c50c0678a01e855510bbda9d26f1ac5b983808f3bf4a5
#9 name: "[stage-0 2/2] RUN --mount=type=cache,target=/root/.cache/pip pip install pyyaml"
#9  started: 2019-09-20 03:19:07.434792944 +0000 UTC
#9 1.894 Collecting pyyaml
#9 2.740   Downloading https://files.pythonhosted.org/packages/e3/e8/b3212641ee2718d556df0f23f78de8303f068fe29cdaa7a91018849582fe/PyYAML-5.1.2.tar.gz (265kB)
#9 3.319 Building wheels for collected packages: pyyaml
#9 3.319   Building wheel for pyyaml (setup.py): started
#9 3.560   Building wheel for pyyaml (setup.py): finished with status 'done'
#9 3.560   Created wheel for pyyaml: filename=PyYAML-5.1.2-cp36-cp36m-linux_x86_64.whl size=44104 sha256=cea5bc4689e231df7915c2fc3abca225d4ee2e869a7540682aacb6d42eb17053
#9 3.560   Stored in directory: /root/.cache/pip/wheels/d9/45/dd/65f0b38450c47cf7e5312883deb97d065e030c5cca0a365030
#9 3.580 Successfully built pyyaml
#9 3.585 Installing collected packages: pyyaml
#9 3.622 Successfully installed pyyaml-5.1.2
#9 completed: 2019-09-20 03:19:12.530742712 +0000 UTC
#9 duration: 5.095949768s

As the log shows, once the BuildKit cache is deleted, the package is downloaded again.

In short, this gives you a cache that is shared across builds and is mounted only while the image is being built. The image itself does not contain the cached files, so you avoid bloating it with large intermediate layers.

EDIT for folks who are using docker compose and are too lazy to read the comments:

You can also do this with docker-compose if you set COMPOSE_DOCKER_CLI_BUILD=1. For example: COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 docker-compose build

UPDATE 2020/09/02, in response to people's questions:

I don't know since which version (I am currently on 19.03.11), but if no mode is specified for the cache directory, the cache is not reused by the next build.

I don't know the exact reason, but adding mode=0755 to the cache mount in the Dockerfile makes it work again:

Dockerfile:

# syntax = docker/dockerfile:experimental
FROM python:3.6-alpine
RUN --mount=type=cache,mode=0755,target=/root/.cache/pip pip install pyyaml
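The cache mount can be combined with the usual practice of copying requirements.txt before the rest of the app. The sketch below is an illustration under assumptions rather than part of the original answer: the /app working directory and a requirements.txt at the root of the build context are hypothetical.

```dockerfile
# syntax = docker/dockerfile:experimental
FROM python:3.6-alpine
WORKDIR /app

COPY requirements.txt .
# The cache mount keeps pip's download and wheel cache across builds, so
# after requirements.txt changes, packages already in the cache should
# not need to be downloaded again.
RUN --mount=type=cache,mode=0755,target=/root/.cache/pip \
    pip install -r requirements.txt

# Assumed layout: the remaining application files are copied last.
COPY . .
```

Build it the same way as above, with DOCKER_BUILDKIT=1 set. Do not pass --no-cache-dir to pip here, since that option disables the very cache the mount is meant to preserve.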
