How can I create a Docker image to run both Python and R?

Question

I want to containerise a pipeline of code that was predominantly developed in Python but depends on a model that was trained in R. Both codebases also have their own requirements and packages. How can I create a Docker image that allows me to build a container that will run this Python and R code together?

For context, I have R code that runs a model (random forest), but it needs to be part of a data pipeline that was built in Python. The Python pipeline first does some processing and generates the input for the model, then executes the R code with that input, and finally passes the output to the next stage of the Python pipeline.
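One way to picture that hand-off is as a pair of files exchanged between the two languages. The sketch below is only an assumption about how the stages might be wired: the question does not say how the input reaches R, so the CSV format, the file names model_input.csv and model_output.csv, and the names run_model_stage and run_r are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def run_model_stage(rows, fieldnames, run_r):
    """Write rows to a CSV, let run_r produce an output CSV, and read it back.

    run_r is any callable taking (input_path, output_path); in the real
    pipeline it would start the R script. All names here are hypothetical.
    """
    with tempfile.TemporaryDirectory() as workdir:
        input_path = Path(workdir) / "model_input.csv"
        output_path = Path(workdir) / "model_output.csv"

        # Stage 1: Python writes the model input.
        with input_path.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)

        # Stage 2: the R side reads input_path and writes output_path.
        run_r(input_path, output_path)

        # Stage 3: Python reads the model output for the next stage.
        # Values come back as strings because CSV carries no type information.
        with output_path.open(newline="") as handle:
            return list(csv.DictReader(handle))
```

Passing the R step in as a callable keeps the file handling testable on a machine without R installed; a stand-in function that writes the output file is enough to exercise it.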

So I've created a template for this process by writing a simple test Python script that calls the R code ("test_call_r.py", which imports the subprocess module), and I need to put this in a Docker container with the necessary requirements and packages for both Python and R.
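A script of that kind might look roughly like the following. This is a sketch, not the asker's actual test_call_r.py: the function names build_rscript_command and call_r are hypothetical, and using Rscript's --vanilla option is an assumption made here for reproducibility. The keyword arguments are chosen so the code also runs on Python 3.6.

```python
import subprocess


def build_rscript_command(script_path, args=()):
    """Return the argument list used to run an R script non-interactively."""
    # --vanilla tells R not to load or save workspaces or user profiles.
    return ["Rscript", "--vanilla", str(script_path)] + [str(a) for a in args]


def call_r(script_path, args=(), timeout=None):
    """Run the R script and return its standard output as text.

    Raises subprocess.CalledProcessError if R exits with a non-zero status,
    and FileNotFoundError if Rscript is not installed or not on PATH.
    """
    completed = subprocess.run(
        build_rscript_command(script_path, args),
        check=True,                 # fail loudly when the R script fails
        stdout=subprocess.PIPE,     # capture what the script prints
        stderr=subprocess.PIPE,
        universal_newlines=True,    # decode bytes to str (works on 3.6+)
        timeout=timeout,
    )
    return completed.stdout
```

Inside the container, a call such as call_r("myscript.R", ["model_input.csv"]) would run the R script with one argument (both file names are hypothetical). Passing the command as a list rather than a single shell string avoids quoting problems with paths that contain spaces.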

I have been able to build the Docker container for the Python pipeline itself, but cannot successfully install R and the associated packages alongside the Python requirements. I want to rewrite the Dockerfile to create an image to do this.

From the Dockerhub documentation I can create an image for the Python pipeline using, e.g.,

FROM python:3
WORKDIR /app
COPY requirements.txt /app/
RUN pip install --no-cache-dir -r requirements.txt
COPY . /app
CMD [ "python", "./test_call_r.py" ]

And similarly from Dockerhub I can use a base R image (or Rocker) to create a Docker container that can run a randomForest model, e.g.,

FROM r-base
WORKDIR /app    
COPY myscripts /app/
RUN Rscript -e "install.packages('randomForest')"
CMD ["Rscript", "myscript.R"] 

But what I need is to create an image that can install the requirements and packages for both Python and R, and execute the codebase to run R from a subprocess in Python. How can I do this?

Recommended answer

The Dockerfile I built for Python and R to run together with their dependencies in this manner is:

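# Pinned to 18.04: the python3.6 package installed below is not available
# in the default repositories of newer Ubuntu releases.
FROM ubuntu:18.04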

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get install -y --no-install-recommends build-essential r-base r-cran-randomforest python3.6 python3-pip python3-setuptools python3-dev

WORKDIR /app

COPY requirements.txt /app/requirements.txt

RUN pip3 install -r requirements.txt

RUN Rscript -e "install.packages('data.table')"

COPY . /app

The commands to build the image, run the container (naming it SnakeR here), and execute the code are:

docker build -t my_image .
docker run -it --name SnakeR my_image
docker exec SnakeR /bin/sh -c "python3 test_call_r.py"
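Before running the pipeline itself, it can be worth confirming that both interpreters are actually on the PATH inside the container. The helper below is a minimal sketch of such a check; the name missing_tools and the idea of saving it as a separate script are assumptions, not part of the original answer.

```python
import shutil


def missing_tools(tools=("python3", "Rscript"), which=shutil.which):
    """Return the names from tools that cannot be found on PATH.

    The which parameter exists only so the lookup can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]
```

Run inside the container, an empty list from missing_tools() means both python3 and Rscript were found; any name in the result points to a package that the apt-get install step did not provide.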

I treated it like an Ubuntu OS and built the image as follows:


  • suppress the prompts for choosing your location during the R install (ENV DEBIAN_FRONTEND=noninteractive);
  • update the apt-get package lists (apt-get update);
  • set installation criteria of:
    • -y = answer yes to user prompts for proceeding (e.g. confirming the disk space to be used);
    • --no-install-recommends = install only the required dependencies, skipping packages that are merely recommended.

My blog post: https://datascienceunicorn.tumblr.com/post/182297983466/building-a-docker-to-run-python-r
