portable conda environment as a binary tarball
Question
I am trying to build a portable conda environment, so we can tarball it and distribute it to another server (or many servers) later. The place where we build the environment and the place where it will be deployed later are two different places.

I noticed that conda create and conda install both hard-code the environment location into the shebang of all the installed scripts under environment_path\bin. Is there a way to override that?
We have to use the deployment location, not the build location, in the shebangs of that environment's bin/ scripts.
Also filed https://github.com/conda/conda/issues/7861
Disclaimer: I'm aware of the option of rebuilding the environment from an exported yaml file, but that is not what we're looking for here. We want to make the conda environment redistributable/portable as a binary tarball: the deployment location is known, but it is not the same as the environment build location.
Answer
I just found conda-pack, which seems to address this issue directly:
https://github.com/conda/conda-pack
conda-pack is a command line tool for creating relocatable conda environments. This is useful for deploying code in a consistent environment, potentially in a location where python/conda isn't already installed.
Documentation: https://conda.github.io/conda-pack/
Use cases:
- Bundling an application with its environment for deployment
- Packaging a conda environment for usage with Apache Spark when deploying on YARN (see here for more information).
- Packaging a conda environment for deployment on Apache YARN. One way to do this is to use Skein.
- Archiving an environment in a functioning state.
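As a minimal sketch of the workflow (names like `my_env` and `my_env.tar.gz` are placeholders), you pack the environment on the build machine, copy the tarball, and unpack it at the known deployment location. The final `conda-unpack` step is what rewrites the hard-coded shebang/prefix paths to the new location, which is exactly the problem described above:

```shell
# On the build machine: install conda-pack and pack the environment
conda install -c conda-forge conda-pack
conda pack -n my_env -o my_env.tar.gz

# On the deployment machine: unpack into the target directory
mkdir -p /opt/my_env
tar -xzf my_env.tar.gz -C /opt/my_env

# Activate and fix up prefixes/shebangs for the new location
source /opt/my_env/bin/activate
conda-unpack
```

After `conda-unpack` runs, the scripts under `bin/` point at the deployment prefix rather than the build prefix, and the environment can be used without conda installed on the target.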
Update: Some of our other PySpark applications use a conda environment at a location that is available on all Hadoop nodes (an NFS mount), and that works very well for conda environments that don't have a ton of dependencies.
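For the Spark-on-YARN use case mentioned above, a hedged sketch of how a packed environment can be shipped with a job (the tarball name `environment.tar.gz`, the `#environment` alias, and `script.py` are placeholders; this follows the pattern documented by conda-pack for YARN deployments):

```shell
# Ship the packed conda environment to YARN containers and point
# PySpark at the Python interpreter inside the unpacked archive.
PYSPARK_PYTHON=./environment/bin/python \
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./environment/bin/python \
  --archives environment.tar.gz#environment \
  script.py
```

YARN extracts the archive into each container's working directory under the `environment` alias, so no NFS mount or pre-installed conda is needed on the nodes.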