Parallel Pip install


Problem description

Our Django project is getting huge. We have hundreds of apps and use a ton of third-party Python packages, many of which need to have C compiled. Our deployments take a long time when we need to create a new virtual environment for a major release. With that said, I'm looking to speed things up, starting with pip. Does anyone know of a fork of pip that will install packages in parallel?

Steps I've taken so far:

  • I've looked for a project that does just this, with little success. I did find this GitHub Gist: https://gist.github.com/1971720, but the results are almost exactly the same as our single-threaded friend.

I then found the pip project on GitHub and started looking through the network of forks to see if I could find any commits that mentioned doing what I'm trying to do. It's a mess in there. I will fork it and try to parallelize it myself if I have to; I just want to avoid spending time doing that.

I saw a talk at DjangoCon 2011 from ep.io explaining their deployment setup. They mentioned parallelizing pip, shipping .so files instead of compiling C, and mirroring PyPI, but they didn't touch on how they did it or what they used.

Recommended answer

Have you profiled the deployment process to see where the time really goes? It surprises me that running multiple parallel pip processes does not speed it up much.
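For concreteness, one way to try the "multiple parallel pip processes" idea is a sketch like the following. This is a hypothetical illustration, not anything from the answer: it assumes the entries in requirements.txt can be installed independently, which is often false, since pip resolves dependencies per process and the workers can race on shared dependencies.

```python
# Hypothetical sketch: split requirements across N workers, each running
# its own `pip install` process. Caveat: workers may install overlapping
# dependencies, which is one reason this tends to help less than hoped.
import subprocess
from concurrent.futures import ThreadPoolExecutor

def chunk(reqs, n):
    """Round-robin the requirement lines into up to n non-empty chunks."""
    chunks = [[] for _ in range(n)]
    for i, req in enumerate(reqs):
        chunks[i % n].append(req)
    return [c for c in chunks if c]

def install_chunk(reqs):
    # One separate pip process per chunk; the threads just wait on them.
    return subprocess.call(["pip", "install", *reqs])

def parallel_install(reqs, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(install_chunk, chunk(reqs, workers)))

if __name__ == "__main__":
    with open("requirements.txt") as f:
        reqs = [line.strip() for line in f
                if line.strip() and not line.startswith("#")]
    parallel_install(reqs)
```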

If the time goes to querying PyPI and finding the packages (in particular when you also download from GitHub and other sources), then it may be beneficial to set up your own PyPI. You can host PyPI yourself and add the following to your requirements.txt file (docs):

--extra-index-url YOUR_URL_HERE

or the following if you wish to replace the official PyPI altogether:

--index-url YOUR_URL_HERE
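For example, the top of a requirements.txt pointing at an internal mirror could look like this. The host, port, and pinned versions below are made up for illustration only:

```
# requirements.txt -- hypothetical internal mirror replacing the official PyPI
--index-url http://pypi.internal:8080/simple/
Django==1.3.1
PIL==1.1.7
```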

This may speed up download times, as all packages are now found on a nearby machine.

A lot of time also goes into compiling packages with C code, such as PIL. If that turns out to be the bottleneck, it's worth looking into compiling the code in multiple processes. You may even be able to share the compiled binaries between your machines (though many things would need to match, such as the operating system, CPU word length, et cetera).
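The answer doesn't say how ep.io shared their binaries, but in current pip the usual way to get this effect is wheels: build every requirement into a binary wheel once on a build machine, then install from that local wheel directory everywhere, with no compilation step. A sketch of the two pip invocations, with the file and directory names assumed for illustration:

```python
# Hypothetical sketch: compile once with `pip wheel`, then install from
# the prebuilt wheels with `--no-index --find-links` on every machine.
import subprocess

def wheel_cmd(requirements="requirements.txt", wheelhouse="wheelhouse"):
    # `pip wheel` builds each package (including C extensions) exactly once.
    return ["pip", "wheel", "-r", requirements, "-w", wheelhouse]

def install_cmd(requirements="requirements.txt", wheelhouse="wheelhouse"):
    # `--no-index --find-links` installs purely from the local wheelhouse,
    # so no PyPI queries and no compilation happen at deploy time.
    return ["pip", "install", "--no-index",
            "--find-links", wheelhouse, "-r", requirements]

if __name__ == "__main__":
    subprocess.check_call(wheel_cmd())   # on the build machine
    subprocess.check_call(install_cmd()) # on each deploy target
```

The same caveat as with shipping .so files applies: the wheels are only reusable on machines with a matching platform.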
