PyPI is slow. How do I run my own server?

Question

When a new developer joins the team, or Jenkins runs a complete build, I need to create a fresh virtualenv. I often find that setting up a virtualenv with pip and a large number of requirements (more than 10) takes a very long time to install everything from PyPI. Often it fails altogether with:

Downloading/unpacking Django==1.4.5 (from -r requirements.pip (line 1))
Exception:
Traceback (most recent call last):
  File "/var/lib/jenkins/jobs/hermes-web/workspace/web/.venv/lib/python2.6/site-packages/pip-1.2.1-py2.6.egg/pip/basecommand.py", line 107, in main
    status = self.run(options, args)
  File "/var/lib/jenkins/jobs/hermes-web/workspace/web/.venv/lib/python2.6/site-packages/pip-1.2.1-py2.6.egg/pip/commands/install.py", line 256, in run
    requirement_set.prepare_files(finder, force_root_egg_info=self.bundle, bundle=self.bundle)
  File "/var/lib/jenkins/jobs/hermes-web/workspace/web/.venv/lib/python2.6/site-packages/pip-1.2.1-py2.6.egg/pip/req.py", line 1018, in prepare_files
    self.unpack_url(url, location, self.is_download)
  File "/var/lib/jenkins/jobs/hermes-web/workspace/web/.venv/lib/python2.6/site-packages/pip-1.2.1-py2.6.egg/pip/req.py", line 1142, in unpack_url
    retval = unpack_http_url(link, location, self.download_cache, self.download_dir)
  File "/var/lib/jenkins/jobs/hermes-web/workspace/web/.venv/lib/python2.6/site-packages/pip-1.2.1-py2.6.egg/pip/download.py", line 463, in unpack_http_url
    download_hash = _download_url(resp, link, temp_location)
  File "/var/lib/jenkins/jobs/hermes-web/workspace/web/.venv/lib/python2.6/site-packages/pip-1.2.1-py2.6.egg/pip/download.py", line 380, in _download_url
    chunk = resp.read(4096)
  File "/usr/lib64/python2.6/socket.py", line 353, in read
    data = self._sock.recv(left)
  File "/usr/lib64/python2.6/httplib.py", line 538, in read
    s = self.fp.read(amt)
  File "/usr/lib64/python2.6/socket.py", line 353, in read
    data = self._sock.recv(left)
timeout: timed out

I'm aware of pip's --use-mirrors flag, and people on my team have sometimes worked around the problem by using --index-url http://f.pypi.python.org/simple (or another mirror) until they find a mirror that responds in a timely fashion. We're in the UK, but there's a PyPI mirror in Germany, and we don't have issues downloading data from other sites.

So, I'm looking at ways to mirror PyPI internally for our team.

The options I've looked at are:

  1. Running my own PyPI instance. There's the official PyPI implementation, CheeseShop, as well as several third-party implementations such as djangopypi and pypiserver (see footnote).

The problem with this approach is that I'm not interested in full PyPI functionality with file upload; I just want to mirror the content it provides.

  2. Using pep381client.

This looks like it could work, but it requires my mirror to download everything from PyPI first. I've set up a test instance of pep381client, but my download speed varies between 5 Kb/s and 200 Kb/s (bits, not bytes). Unless there's a copy of the full PyPI archive somewhere, it will take me weeks to have a useful mirror.

  3. Using a PyPI round-robin proxy such as yopypi.

This is irrelevant now that http://pypi.python.org itself consists of several geographically distinct servers.

  4. Copying a virtualenv around between developers, or hosting a folder of the current project's dependencies.

This doesn't scale: we have several different Python projects whose dependencies change (slowly) over time. As soon as the dependencies of any project change, this central folder must be updated to add the new dependencies. Copying the virtualenv is worse than copying the packages, though, since any Python package with C modules needs to be compiled for the target system. Our team has both Linux and OS X users.

(This still looks like the best option of a bad bunch.)

  5. Using an intelligent PyPI caching proxy: collective.eggproxy

This seems like it would be a very good solution, but the last version on PyPI is dated 2009 and discusses mod_python.

What do other large Python teams do? What's the best solution to quickly install the same set of Python packages?

Footnotes:

  • I've seen the question How to roll my own PyPI?, but that question relates to hosting private code.
  • The Python wiki lists alternative PyPI implementations
  • I've also recently discovered Crate.io but I don't believe that helps me when using Pip.
  • There's a website monitoring PyPI mirror status
  • Some packages on PyPI have their files hosted elsewhere so even a perfect mirror won't help all dependencies

Recommended answer

Do you have a shared filesystem?

If so, I would use pip's download cache setting. It's pretty simple: make a folder called pip-cache in /mnt, for example.

mkdir /mnt/pip-cache

Then each developer would put the following lines into their pip config (Unix: $HOME/.pip/pip.conf, Windows: %HOME%\pip\pip.ini):

[global]
download-cache = /mnt/pip-cache
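
If editing the file by hand on every machine is a chore, the same setting could be written by a small script. This is only a minimal sketch, not something pip provides: the helper name set_download_cache is an assumption made for illustration, and only the [global] section and the download-cache key come from the config above. It also assumes a pip version that still honours download-cache, so check this against the pip you actually run.

```python
import configparser
import os


def set_download_cache(config_path, cache_dir):
    """Add or update `download-cache` in the [global] section of a pip config.

    Hypothetical helper for illustration. Other settings already in the file
    are kept, but comments in the file are lost when it is rewritten.
    """
    # interpolation=None so values containing '%' are stored literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_path)  # a missing file is silently skipped
    if not parser.has_section("global"):
        parser.add_section("global")
    parser.set("global", "download-cache", cache_dir)

    # Create the parent directory (e.g. ~/.pip) if it does not exist yet.
    parent = os.path.dirname(config_path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(config_path, "w") as handle:
        parser.write(handle)
```

Calling set_download_cache(os.path.expanduser("~/.pip/pip.conf"), "/mnt/pip-cache") would then produce the two lines shown above on a Unix machine.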

pip still checks PyPI for the latest version, then checks whether that version is in the cache. If it is, pip installs it from there. If not, pip downloads it, stores it in the cache, and installs it. So each package is downloaded only once per new version.
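
The lookup order described here can be modelled in a few lines. This is a simplified sketch of the idea, not pip's actual implementation: the function install_from_cache, the download callable, and the file naming scheme are all assumptions made for illustration, and pip's own cache layout is not reproduced.

```python
import os


def install_from_cache(package, latest_version, cache_dir, download):
    """Return the path of the archive to install, downloading only on a miss.

    `latest_version` stands in for the index lookup that still happens on
    every run; `download` is a hypothetical callable
    (package, version) -> bytes that fetches the archive.
    """
    archive = os.path.join(cache_dir, "%s-%s.tar.gz" % (package, latest_version))
    if not os.path.exists(archive):
        # Cache miss: fetch once and store it for every later install.
        data = download(package, latest_version)
        with open(archive, "wb") as handle:
            handle.write(data)
    return archive
```

With a shared cache_dir, the second developer to request the same version gets the stored archive and download is never called; a new version triggers exactly one more download.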
