Why are packages installed rather than just linked to a specific environment?

Question

I've noticed that when Python packages are installed with the usual package managers, they end up inside the environment itself: under /home/user/anaconda3/envs/env_name/ when installed with conda, and under /home/user/anaconda3/envs/env_name/lib/python3.6/site-packages/ when installed with pip inside a conda environment.

But conda caches all the recently downloaded packages too.

So, my question is: why doesn't conda install all packages in one central location and, when a package is installed into a specific environment, create a link to that location rather than installing a copy there?

I've noticed that environments grow quite large, and this approach would probably save some space.

Recommended answer

Conda already does this. However, because it uses hardlinks, it is easy to overestimate the space actually being used, especially if one only looks at the size of a single env at a time.
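To make the hardlink idea concrete, here is a minimal Python sketch. It does not call conda at all: the directory names (`pkgs`, `envs/env_a`, `envs/env_b`) and the helper `link_demo` are made up for illustration, and it assumes a filesystem that supports hardlinks. It only shows that a hardlinked file is one inode with several names, so its data is stored once.

```python
import os
import tempfile


def link_demo():
    """Create one file in a fake package cache and hardlink it into two fake envs.

    Returns (same_inode, link_count) for the cached file.
    All paths are hypothetical stand-ins, not conda's real layout.
    """
    with tempfile.TemporaryDirectory() as root:
        pkgs = os.path.join(root, "pkgs")
        os.makedirs(pkgs)
        cached = os.path.join(pkgs, "module.py")
        with open(cached, "wb") as fh:
            fh.write(b"x" * 1000)

        linked = []
        for env in ("env_a", "env_b"):
            env_dir = os.path.join(root, "envs", env)
            os.makedirs(env_dir)
            target = os.path.join(env_dir, "module.py")
            os.link(cached, target)  # hardlink: a second name for the same inode
            linked.append(target)

        stats = [os.stat(p) for p in [cached] + linked]
        # All three paths should resolve to the same (device, inode) pair.
        same_inode = len({(s.st_dev, s.st_ino) for s in stats}) == 1
        # The link count is the number of names pointing at that inode.
        return same_inode, stats[0].st_nlink


if __name__ == "__main__":
    print(link_demo())
```

Any tool that sums file sizes path by path will count those 1000 bytes three times, which is the overestimate described above.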

To illustrate, let's use du to inspect the real disk usage. First, if I measure each environment directory individually, I get the uncorrected per-env usage:

$ for d in envs/*; do du -sh "$d"; done
2.4G    envs/pymc36
1.7G    envs/pymc3_27
1.4G    envs/r-keras
1.7G    envs/stan
1.2G    envs/velocyto

This is roughly what a GUI file browser would show.

If I instead let du measure them in a single invocation (so that each hardlinked file is counted only once), we get:

$ du -sh envs/*
2.4G    envs/pymc36
326M    envs/pymc3_27
820M    envs/r-keras
927M    envs/stan
548M    envs/velocyto

One can see that a significant amount of space is already being saved here.
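The difference between the two du runs can be reproduced with a short Python sketch. This is an illustration of the counting logic only, not how du is implemented: it sums apparent file sizes (`st_size`) rather than disk blocks, and the function names and the tiny fake `pkgs`/`envs` tree built in `demo` are assumptions made for the example.

```python
import os
import tempfile


def tree_files(top):
    """Yield the path of every file below `top`."""
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            yield os.path.join(dirpath, name)


def naive_size(top):
    """Sum file sizes under `top`, as if each directory were measured alone."""
    return sum(os.lstat(p).st_size for p in tree_files(top))


def deduplicated_sizes(tops):
    """Measure several trees in order, counting each inode only the first time."""
    seen = set()
    sizes = []
    for top in tops:
        total = 0
        for path in tree_files(top):
            st = os.lstat(path)
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # already attributed to an earlier tree
            seen.add(key)
            total += st.st_size
        sizes.append(total)
    return sizes


def write_bytes(path, n):
    with open(path, "wb") as fh:
        fh.write(b"x" * n)


def demo():
    """Build a fake cache plus two fake envs and measure them three ways."""
    with tempfile.TemporaryDirectory() as root:
        pkgs = os.path.join(root, "pkgs")
        env_a = os.path.join(root, "envs", "env_a")
        env_b = os.path.join(root, "envs", "env_b")
        for d in (pkgs, env_a, env_b):
            os.makedirs(d)
        write_bytes(os.path.join(pkgs, "lib.bin"), 1000)  # shared package file
        os.link(os.path.join(pkgs, "lib.bin"), os.path.join(env_a, "lib.bin"))
        os.link(os.path.join(pkgs, "lib.bin"), os.path.join(env_b, "lib.bin"))
        write_bytes(os.path.join(env_a, "own.txt"), 10)   # env-specific files
        write_bytes(os.path.join(env_b, "own.txt"), 20)

        separately = [naive_size(env_a), naive_size(env_b)]
        together = deduplicated_sizes([env_a, env_b])
        with_pkgs = deduplicated_sizes([pkgs, env_a, env_b])
        return separately, together, with_pkgs


if __name__ == "__main__":
    print(demo())
```

In this toy tree the shared file is charged to whichever directory is measured first, so the later directories shrink to just their own files, the same pattern as in the listings here.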

Most of the hardlinks point back to the pkgs directory, so if we include that as well:

$ du -sh pkgs envs/*
8.2G    pkgs
400M    envs/pymc36
116M    envs/pymc3_27
 92M    envs/r-keras
 62M    envs/stan
162M    envs/velocyto

One can see that, apart from the shared packages, the envs are fairly light. If you're concerned about the size of my pkgs, note that I have never run conda clean on this system, so my pkgs directory is full of tarballs and superseded packages, plus some infrastructure I keep in the base environment (e.g., Jupyter, Git, etc.).
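To get a rough feel for how much of a cache directory is actually shared with environments, one can split its files by hardlink count. The sketch below is an assumption-laden illustration: `split_by_link_count` is a hypothetical helper, it uses apparent sizes, and it treats any file with a link count above 1 as "shared", which only holds if the other links live in environments on the same filesystem. It makes no claim about what conda clean would or would not remove.

```python
import os
import tempfile


def split_by_link_count(top):
    """Return (shared_bytes, unshared_bytes) for files below `top`.

    A file counts as shared when its inode has more than one name.
    Each inode is counted once, even if two names for it are inside `top`.
    """
    seen = set()
    shared = unshared = 0
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            if st.st_nlink > 1:
                shared += st.st_size
            else:
                unshared += st.st_size
    return shared, unshared


def demo():
    """Fake cache: one file linked into an env, one archive nobody links to."""
    with tempfile.TemporaryDirectory() as root:
        pkgs = os.path.join(root, "pkgs")
        env = os.path.join(root, "envs", "env_a")
        os.makedirs(pkgs)
        os.makedirs(env)
        with open(os.path.join(pkgs, "lib.bin"), "wb") as fh:
            fh.write(b"x" * 1000)
        with open(os.path.join(pkgs, "old.tar.bz2"), "wb") as fh:
            fh.write(b"x" * 500)
        os.link(os.path.join(pkgs, "lib.bin"), os.path.join(env, "lib.bin"))
        return split_by_link_count(pkgs)


if __name__ == "__main__":
    print(demo())
```

Pointing `split_by_link_count` at a real pkgs directory would give an estimate of how much of it consists of files no environment currently links to, such as leftover tarballs.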
