dsl.ContainerOp with Python


Question

What are the options to download .py files into the execution environment?

In this example:

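import kfp.dsl as dsl  # KFP v1 SDK, which provides dsl.ContainerOp

class Preprocess(dsl.ContainerOp):
  """Pipeline step that runs run_preprocess.py inside the container image."""

  def __init__(self, name, bucket, cutoff_year):
    super(Preprocess, self).__init__(
      name=name,
      # image needs to be a compile-time string
      image='gcr.io/<project>/<image-name>/cpu:v1',
      # run_preprocess.py must already exist in the container's working directory
      command=['python3', 'run_preprocess.py'],
      arguments=[
        '--bucket', bucket,
        '--cutoff_year', cutoff_year,
        '--kfp'
      ],
      # the script is expected to write its output path to this file
      file_outputs={'blob-path': '/blob_path.txt'}
    )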

The run_preprocess.py file is being called from the CLI.

The question is: how do I get that file in there?

I have seen this interesting example: https://github.com/benjamintanweihao/kubeflow-mnist/blob/master/pipeline.py, which clones the code before running the pipeline.

The other way would be to git clone the code in the Dockerfile (although the image would take forever to build).

What other options are there?

Recommended answer

To kickstart KFP development using Python, try the following tutorial: Data passing in python components

"it clones the code before running the pipeline. The other way would be git cloning with Dockerfile (although the image would take forever to build)"

Ideally, the files should be inside the container image (the Dockerfile method). This ensures maximum reproducibility.

For Python scripts that are not very complex, the Lightweight python component feature allows you to create a component from a Python function. In this case the script code is stored in the component command line, so you do not need to upload the code anywhere.
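To illustrate the idea of carrying the script in the command line itself, here is a minimal sketch that uses only the standard library. It is not KFP's actual implementation: the SCRIPT contents, the build_command helper, and the argument values are hypothetical. A command list of the same shape (interpreter, `-c`, script text, then arguments) could in principle be given as the `command` of a container step, assuming the image already has Python and the script's dependencies installed.

```python
import subprocess
import sys

# Hypothetical stand-in for run_preprocess.py, kept as a string so that
# no script file has to exist inside the container image.
SCRIPT = """
import sys
bucket, cutoff_year = sys.argv[1], sys.argv[2]
print(f"preprocessing bucket={bucket} cutoff_year={cutoff_year}")
"""


def build_command(script, *args, python=sys.executable):
    """Return a command list that runs `script` inline via `python -c`."""
    return [python, "-c", script, *args]


# With `python -c`, sys.argv[0] is "-c" and the extra arguments follow it.
command = build_command(SCRIPT, "my-bucket", "2010")
result = subprocess.run(command, capture_output=True, text=True, check=True)
print(result.stdout.strip())
```

This keeps the code and the pipeline definition together, which is convenient for short scripts; longer scripts are usually easier to maintain when baked into the image.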

Putting scripts somewhere remote (e.g. cloud storage or a website) is possible, but can reduce reliability and reproducibility.
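If a script is fetched from a remote location at run time, pinning it to a known checksum is one way to recover some of the lost reproducibility. The sketch below shows that idea with the standard library only. The helper names (verify_script, fetch_script), the URL parameter, and the sample script are all assumptions made for illustration; none of this is part of KFP.

```python
import hashlib
import urllib.request


def verify_script(data: bytes, expected_sha256: str) -> bytes:
    """Return data unchanged if its SHA-256 matches, else raise ValueError."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise ValueError(
            f"checksum mismatch: expected {expected_sha256}, got {actual}"
        )
    return data


def fetch_script(url: str, expected_sha256: str, dest: str) -> str:
    """Download a script, verify its checksum, and write it to dest."""
    with urllib.request.urlopen(url) as response:
        data = response.read()
    with open(dest, "wb") as f:
        f.write(verify_script(data, expected_sha256))
    return dest


# Local demonstration that needs no network access.
script_bytes = b"print('hello from a pinned script')\n"
pinned = hashlib.sha256(script_bytes).hexdigest()
verified = verify_script(script_bytes, pinned)
```

A step could call fetch_script first and then run the downloaded file; if the remote content changes, the step fails loudly instead of silently running different code.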

P.S.

"although the image would take forever to build"

It shouldn't. The first build might be slow because the base image has to be pulled, but after that it should be fast, since only the new layers are pushed. (This requires choosing a good base image that has all dependencies installed, so that your Dockerfile only adds your scripts.)
