PyPI is slow. How do I run my own server?

Question

When a new developer joins the team, or Jenkins runs a complete build, I need to create a fresh virtualenv. I often find that setting up a virtualenv with Pip and a large number (more than 10) of requirements takes a very long time to install everything from PyPI. Often it fails altogether with:

Downloading/unpacking Django==1.4.5 (from -r requirements.pip (line 1))
Exception:
Traceback (most recent call last):
  File "/var/lib/jenkins/jobs/hermes-web/workspace/web/.venv/lib/python2.6/site-packages/pip-1.2.1-py2.6.egg/pip/basecommand.py", line 107, in main
    status = self.run(options, args)
  File "/var/lib/jenkins/jobs/hermes-web/workspace/web/.venv/lib/python2.6/site-packages/pip-1.2.1-py2.6.egg/pip/commands/install.py", line 256, in run
    requirement_set.prepare_files(finder, force_root_egg_info=self.bundle, bundle=self.bundle)
  File "/var/lib/jenkins/jobs/hermes-web/workspace/web/.venv/lib/python2.6/site-packages/pip-1.2.1-py2.6.egg/pip/req.py", line 1018, in prepare_files
    self.unpack_url(url, location, self.is_download)
  File "/var/lib/jenkins/jobs/hermes-web/workspace/web/.venv/lib/python2.6/site-packages/pip-1.2.1-py2.6.egg/pip/req.py", line 1142, in unpack_url
    retval = unpack_http_url(link, location, self.download_cache, self.download_dir)
  File "/var/lib/jenkins/jobs/hermes-web/workspace/web/.venv/lib/python2.6/site-packages/pip-1.2.1-py2.6.egg/pip/download.py", line 463, in unpack_http_url
    download_hash = _download_url(resp, link, temp_location)
  File "/var/lib/jenkins/jobs/hermes-web/workspace/web/.venv/lib/python2.6/site-packages/pip-1.2.1-py2.6.egg/pip/download.py", line 380, in _download_url
    chunk = resp.read(4096)
  File "/usr/lib64/python2.6/socket.py", line 353, in read
    data = self._sock.recv(left)
  File "/usr/lib64/python2.6/httplib.py", line 538, in read
    s = self.fp.read(amt)
  File "/usr/lib64/python2.6/socket.py", line 353, in read
    data = self._sock.recv(left)
timeout: timed out

I'm aware of pip's --use-mirrors flag, and sometimes people on my team have worked around the problem by passing --index-url http://f.pypi.python.org/simple (or another mirror), trying mirrors until they find one that responds in a timely fashion. We're in the UK, but there's a PyPI mirror in Germany, and we don't have issues downloading data from other sites.

So, I'm looking at ways to mirror PyPI internally for our team.

The options I've looked at are:

  1. Running my own PyPI instance. There's the official PyPI implementation: CheeseShop as well as several third party implementations, such as: djangopypi and pypiserver (see footnote)

The problem with this approach is that I'm not interested in full PyPI functionality with file upload, I just want to mirror the content it provides.

  2. Using pep381client or pypi-mirror.

This looks like it could work, but it requires my mirror to download everything from PyPI first. I've set up a test instance of pep381client, but my download speed varies between 5 Kb/s and 200 Kb/s (bits, not bytes). Unless there's a copy of the full PyPI archive somewhere, it will take me weeks to have a useful mirror.

  3. Using a PyPI round-robin proxy such as yopypi.

This is irrelevant now that http://pypi.python.org itself consists of several geographically distinct servers.

  4. Copying around a virtualenv between developers, or hosting a folder of the current project's dependencies.

This doesn't scale: we have several different Python projects whose dependencies change (slowly) over time. As soon as the dependencies of any project change, this central folder must be updated to add the new dependencies. Copying the virtualenv is worse than copying the packages though, since any Python packages with C modules need to be compiled for the target system. Our team has both Linux and OS X users.

(This still looks like the best option of a bad bunch.)

  5. Using an intelligent PyPI caching proxy: collective.eggproxy

This seems like it would be a very good solution, but the last version on PyPI is dated 2009 and discusses mod_python.

What do other large Python teams do? What's the best solution to quickly install the same set of python packages?

Footnotes:

  • I've seen the question How to roll my own PyPI?, but that question relates to hosting private code.
  • The Python wiki lists alternative PyPI implementations
  • I've also recently discovered Crate.io but I don't believe that helps me when using Pip.
  • There's a website monitoring PyPI mirror status
  • Some packages on PyPI have their files hosted elsewhere so even a perfect mirror won't help all dependencies

Recommended answer

Do you have a shared filesystem?

Because I would use pip's cache setting. It's pretty simple. Make a folder called pip-cache in /mnt for example.

mkdir /mnt/pip-cache

Then each developer would put the following lines into their pip config (unix = $HOME/.pip/pip.conf, win = %HOME%\pip\pip.ini)

[global]
download-cache = /mnt/pip-cache
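
If editing the file by hand on every machine is a chore, the same setting could be written by a small script. This is only a sketch: the helper name set_download_cache is made up for illustration, and it assumes the config path and the /mnt/pip-cache folder mentioned above.

```python
import configparser
from pathlib import Path


def set_download_cache(config_path, cache_dir):
    """Add or update download-cache under [global] in a pip config file.

    Hypothetical helper, not part of pip itself.
    """
    config_path = Path(config_path)
    # Disable interpolation so existing values containing '%' are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    # read() silently skips a file that does not exist yet.
    parser.read(config_path)
    if not parser.has_section("global"):
        parser.add_section("global")
    parser.set("global", "download-cache", str(cache_dir))
    config_path.parent.mkdir(parents=True, exist_ok=True)
    with config_path.open("w") as handle:
        parser.write(handle)
    return config_path


# Example call (assumed paths from the answer above):
# set_download_cache(Path.home() / ".pip" / "pip.conf", "/mnt/pip-cache")
```

Note that configparser drops comments when it rewrites a file, so this suits a fresh pip.conf better than a heavily annotated one.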

pip still checks PyPI for the latest version, then checks whether that version is already in the cache. If it is, pip installs it from there. If not, pip downloads it, stores it in the cache, and then installs it. So each package is downloaded only once per new version.
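
The lookup order described above can be sketched roughly as follows. This is not pip's actual code: fetch_package, the download callable, and the name-version.tar.gz file naming are all assumptions made for illustration, and the version is taken to be already resolved against the index.

```python
import os
import tempfile


def fetch_package(name, version, cache_dir, download):
    """Return (path, status) for an archive, downloading only on a cache miss.

    `download(name, version)` is a hypothetical callable returning the
    archive bytes; the cache file naming here is an assumption.
    """
    filename = "%s-%s.tar.gz" % (name, version)
    cached_path = os.path.join(cache_dir, filename)
    if os.path.exists(cached_path):
        # This version was fetched before, so skip the network entirely.
        return cached_path, "cache hit"
    data = download(name, version)
    # Write to a temporary file first, then rename it into place, so other
    # users of the shared folder never see a half-written archive.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    os.replace(tmp_path, cached_path)
    return cached_path, "downloaded"
```

With a shared cache_dir such as /mnt/pip-cache, the first developer or Jenkins job to need a given version pays for the download, and everyone after that gets the cached copy.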
