Optimizing image storage in Docker repository using Jib for Spring Boot


Problem description

Does using Jib to build Docker images help optimize remote Docker repository storage?

We are using Spring Boot in Docker with Gradle. Currently, we are creating standard fat Boot jars with all the dependencies packed inside, and then we create an image with it, like so:

FROM gcr.io/distroless/java:11
COPY ./build/libs/*.jar app.jar
CMD ["app.jar"]

This results in a big (250 MB) new image each time we build, even if very little code is actually changed. This is due to the fact that the fat jar contains both the shared dependencies (which change infrequently) and our code. This is an inefficient use of storage space in our private repository, and we would like to change that.

To achieve this, the idea is as follows:

  • We create a base image which contains only the dependencies in /opt/libs; let's call it spring-base:1.0.0 and push it to our private Docker registry (see the sketch after the Dockerfile below).

  • We use that image as the parent/base of the application image, which contains only our code. The Dockerfile looks similar to this (untested, just to present the concept):

FROM our-registry/spring-base:1.0.0
COPY ./build/classes/kotlin/main/* /opt/classes
COPY ./build/resources/main/* /opt/resources
ENTRYPOINT ["java", "-cp", "/opt/libs/*:/opt/resources:/opt/classes", "com.example.MainKt"]

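As a point of reference, here is a minimal sketch of what the dependencies-only base image Dockerfile could look like. It assumes a Gradle task has already copied the runtime dependency jars into ./build/deps, a hypothetical staging directory that is not part of our current build:

FROM gcr.io/distroless/java:11
# Only the infrequently changing dependency jars go into this base image.
COPY ./build/deps/*.jar /opt/libs/
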
The expectation is that these images are much smaller, and the big base image with dependencies is stored only once, saving a lot of storage.

A colleague of ours looked into Jib and insists it does exactly this, but after reading the whole documentation and FAQ and playing around a bit with it, I am not so sure. We integrated it and use ./gradlew jibDockerBuild and it does seem to create layers for the dependencies, resources, and classes, but there is still just one big image. Jib seems to focus on speeding up build times (by utilizing Docker layer caching) and reproducible builds, but I think that when we upload that image to our repository nothing will change relative to our current solution - we will still store the 'static' dependencies multiple times, but now we will have multiple layers instead of just one in each new image.
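
For reference, this is roughly the shape of the Jib setup we tried; a minimal sketch in Gradle Kotlin DSL, where the plugin version, registry host, and image name are placeholders rather than our real configuration:

// build.gradle.kts (Jib-related parts only; versions and names are illustrative)
plugins {
    id("com.google.cloud.tools.jib") version "2.5.0"
}

jib {
    from {
        // Same base image as in our Dockerfile.
        image = "gcr.io/distroless/java:11"
    }
    to {
        // Placeholder for our private registry and repository.
        image = "our-registry.example.com/spring-app"
        tags = setOf("1.0.0")
    }
    container {
        mainClass = "com.example.MainKt"
    }
}

Running ./gradlew jibDockerBuild builds the image into the local Docker daemon, while ./gradlew jib builds it and pushes it directly to the registry configured under jib.to.image.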

Could anyone with more Docker and Jib experience explain whether Jib gives us the storage space optimization we are looking for?

EDIT: While I was waiting for an answer, I played around with all of this and used https://github.com/wagoodman/dive, docker system df, and docker images to check sizes and look into the images and layers, and it seems Jib does exactly what we need.

Answer

Does using Jib to build Docker images help optimize remote Docker repository storage?

Yes. In fact, it helps to a significant degree, because of Jib's strong image layer reproducibility. When just using a Dockerfile, you usually lose reproducibility for most layers completely, because file timestamps are factored into checking whether layers are identical. For example, even if the bytes of your .class files didn't change at all, regenerating the files loses reproducibility. This is worse for jars: not only can a jar's timestamp change, but jar metadata (for example, META-INF/MANIFEST.MF) contains compile-time information including timestamps, build tool info, JVM version, etc. A jar built on a different machine will be considered different in the Docker world.

This results in a big (250 MB) new image each time we build, even if very little code is actually changed. This is due to the fact that the fat jar contains both the shared dependencies (which change infrequently) and our code.

Partially correct: the size is big (250MB), but not because of the fat jar. The built image will be 250MB even if it is not a fat jar and even if you designate a separate layer for the shared libraries. The size of your final image (250MB) will always include the size of the base image (gcr.io/distroless/java:11) and the size of the shared libraries, no matter how, or with which tool, the image is built.

However, Docker engines do not duplicate layers that they already know about in their storage. Likewise, remote registries do not duplicate layers that already exist in a repository either. Moreover, registries often store exactly one copy of a layer even across different repositories. Therefore, when you update only your code (and hence your jar), only the layer containing that jar will take up new storage space, and Docker and Jib will send only the new layers to remote registries over the network. That is, the base image layers for gcr.io/distroless/java:11 will not be sent.
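
As an annotated illustration (the layer IDs and repository name are made up), pushing a rebuilt image typically reports the unchanged layers as already present and uploads only the new one:

docker push our-registry.example.com/spring-app:1.0.1
The push refers to repository [our-registry.example.com/spring-app]
3f2a91bc7d10: Pushed                 <- small layer containing the changed code
8d7f5e2ab4c3: Layer already exists   <- shared libraries layer
1b3c9d0aa512: Layer already exists   <- base image layer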

We create a base image which contains only the dependencies in /opt/libs, let's call it spring-base:1.0.0 and push to our private Docker registry.

Creating a separate image only to contain shared libraries is not unheard of, and I have seen some people attempt it. However, I don't think you intend to conceptually treat this special base image as an independent, standalone image that is meant to be shared across different kinds of images in your organization. So I think doing so is unconventional in this situation, and this trick is most likely unnecessary if it is only an idea that came off the top of your head for saving storage space (and network bandwidth). Please read on.

The expectation is that these images are much smaller

No. As I explained, you will create an image of the same 250MB size no matter what; it includes the size of the base image, which includes your shared libraries. When running docker images, your local Docker engine will show that the image size is 250MB. But as I said, that does not mean your Docker engine takes up an additional 250MB of space whenever you build a new image.

the big base image with dependencies is stored only once

Yes, but this is also true when you start with FROM gcr.io/distroless/java:11. It is meaningless to shove your shared libraries into another "base image", as long as you can create a separate layer of their own for the shared libraries and keep that layer stable (i.e., reproducible), and Jib is very good at reproducibly building such a layer. The granularity of what is saved in registries is layers, not images, so there is really no need to "mark" that the libraries layer lives in some "base image" (as long as you create its own layer for the libraries). Registries only see layers, and the notion of an "image" is formed simply by declaring "this image is comprised of layer A, layer B, and layer C along with this metadata." An image doesn't even have a notion of a base image; it doesn't say anything like "this image is formed by putting layer A on top of that base image." As long as layer B is a shared libraries layer, you have better optimization than having a fat jar layer.
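
To make this concrete, here is a trimmed, hypothetical image manifest in the Docker manifest schema 2 format; the digests and sizes are placeholders. A registry stores this small manifest plus the referenced layer blobs, and any layer digest it already holds is simply reused rather than stored again:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": { "mediaType": "application/vnd.docker.container.image.v1+json", "digest": "sha256:<config>", "size": 4500 },
  "layers": [
    { "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", "digest": "sha256:<distroless-base>", "size": 52000000 },
    { "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", "digest": "sha256:<dependencies>", "size": 180000000 },
    { "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", "digest": "sha256:<resources>", "size": 500000 },
    { "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", "digest": "sha256:<classes>", "size": 300000 }
  ]
}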

saving a lot of storage.

Therefore, this is not true: Docker engines and registries do not store the same layer multiple times in the first place, so the trick saves nothing extra.

We integrated it and use ./gradlew jibDockerBuild and it does seem to create layers for the dependencies, resources, and classes, but there is still just one big image.

Yes. The image size will be 250MB, and this will still be true when you use a Dockerfile or any other image-building tool. However, when using Jib, if you change only your application .java files, Jib will send only the small application layer (which does not contain the shared libraries or resources) over the network to a remote registry when rebuilding; it doesn't send the whole 250MB of layers, because Jib keeps strong reproducibility. Similarly, if you update only your shared libraries, Jib will send only the libraries layer, saving time, bandwidth, and storage.

Note, however, that because the Docker engine API is limited and gives Jib no way to check whether certain layers are already stored in the Docker engine, Jib has to load the whole 250MB of layers when using jibDockerBuild. This is usually not an issue, because the loading is done locally without going through the network. But because of this API limitation, it is, surprisingly, often faster for Jib to push an image directly to a remote registry than to a local Docker engine; Jib only needs to send the layers that have changed. However, as I have stressed multiple times, even if Jib (or any other image-building tool) loads the whole 250MB of layers into a Docker engine, the engine will save only what is necessary (i.e., new layers it has never seen, or believes it has never seen). It won't duplicate the base image or shared libraries layers; only new, different layers will take up storage. And with a Dockerfile, you'll usually end up generating "new" layers that are practically not new, because of poor reproducibility.
