Populating a Postgres Docker image during the build (not at run time)

Question

I want to prepare a custom image (based on the official Postgres image) that performs two tasks:

  1. Download data (e.g., fetch a CSV file with wget),
  2. Load the data into the database (create tables, run inserts).

I want to do both steps while building the image, not while running the container, because each of them takes a lot of time, and I want to build the image once and then start many containers quickly.

I know how to do step 1 (download the data) while building the image, but I don't know how to do step 2 (load the data into the database) at build time instead of at container start.

Example:

(the download happens while building the image, the load happens while running the container)

Dockerfile:

FROM postgres:10.7

RUN  apt-get update \
  && apt-get install -y wget \
  && rm -rf /var/lib/apt/lists/* 

COPY download.sh /download.sh
RUN /download.sh

download.sh:

#!/bin/bash

cd /docker-entrypoint-initdb.d/
wget https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/northwindextended/northwind.postgre.sql

To download the data I run my own script. To load the data I rely on the "initialization scripts" feature of the official Postgres image.

Building image:

docker build -t mydbimage .

Running image:

docker run --name mydbcontainer -p 5432:5432 -e POSTGRES_PASSWORD=postgres -d mydbimage 

Once the container is running, the logs show how long loading the data takes:

docker logs mydbcontainer

This example dataset is small, but with a bigger one, waiting a long time for every container to start is awkward.

Solution

You can dissect the upstream Dockerfile and its docker-entrypoint.sh and just pick the needed snippets to initialize your database:

FROM postgres:10.7

ENV PGDATA=/var/lib/postgresql/datap-in-image
# this 777 will be replaced by 700 at runtime (allows semi-arbitrary "--user" values)
RUN mkdir -p "$PGDATA" && chown -R postgres:postgres "$PGDATA" && chmod 777 "$PGDATA"

RUN set -x \
  && apt-get update && apt-get install -y --no-install-recommends ca-certificates wget && rm -rf /var/lib/apt/lists/* \
  && wget https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/northwindextended/northwind.postgre.sql \
    -O /docker-entrypoint-initdb.d/northwind.postgre.sql \
  && cp ./docker-entrypoint.sh ./docker-entrypoint-init-only.sh \
  && sed -ri '/exec "\$@"/d' ./docker-entrypoint-init-only.sh \
  && ./docker-entrypoint-init-only.sh postgres \
  && rm ./docker-entrypoint-initdb.d/northwind.postgre.sql ./docker-entrypoint-init-only.sh \
  && apt-get purge -y --auto-remove ca-certificates wget
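
The sed step in the Dockerfile above can be tried on its own, without Docker. The following is only a minimal sketch: the entrypoint file is a shortened, hypothetical stand-in for the upstream docker-entrypoint.sh, and strip_final_exec and workdir are names made up for this illustration. It shows that the sed expression deletes the final exec "$@" line, so the copied script runs the initialization and then returns instead of starting the server.

```shell
#!/bin/sh
# Hypothetical helper: print the given script without any line that
# contains the literal text: exec "$@"
# (same sed expression as in the Dockerfile, minus the in-place flag).
strip_final_exec() {
  sed -r '/exec "\$@"/d' "$1"
}

# Temporary working directory for the stand-in files.
workdir=$(mktemp -d)

# Shortened stand-in for the upstream entrypoint. This is an assumption
# for illustration; the real script is much longer.
cat > "$workdir/docker-entrypoint.sh" <<'EOF'
#!/bin/bash
set -e
echo "init: run initdb and the scripts in /docker-entrypoint-initdb.d"
exec "$@"
EOF

# Write the "init only" variant and show it.
strip_final_exec "$workdir/docker-entrypoint.sh" \
  > "$workdir/docker-entrypoint-init-only.sh"
cat "$workdir/docker-entrypoint-init-only.sh"
```

Because the pattern requires exec to be followed directly by "$@", a line such as exec gosu postgres ... would not match and would be left in place.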

Build, run and test:

docker build -t mydbimage .

# bring up the database
docker run --rm --name pgtest mydbimage

# run this in another terminal to check for the imported data 
docker exec -ti pgtest psql -v ON_ERROR_STOP=1 --username "postgres" --no-password --dbname postgres --command "\d"

Caveats:

  • With this setup, no password is set for the database. You can add one during the build, but then it is persisted in the image, so you would need to make sure that no unauthorized person gets access to the image. Depending on your setup, this might be hard or even impossible to achieve.
  • The second problem is that writes to your database are ephemeral: there is no volume during the build to persist the imported data. That is why PGDATA is changed to a directory that is not declared as a volume.

Basically, these are the reasons why the upstream repository handles the import when the container starts instead of during the build. If you have non-secret data that is only read, never written, it might still make sense to import it during the build to save time and simplify container startup.
