Best practices for (bio)conda versions in Snakemake wrappers?

Question

What are the best practices for specifying packages in the environment.yml of a Snakemake wrapper that uses conda? I understand that the channels should be:

channels:
  - conda-forge
  - bioconda
  - base

However, what is a good choice for specifying packages? Do I specify no version? Full versions?

Pinning full versions has previously led to conda environment resolution that took extremely long or never finished. However, not pinning versions carries the risk of implicitly upgrading to an incompatible version of a package.

Do I specify only direct dependencies or should I put the output of conda env export there so everything is frozen?

Recommended answer

For package version numbers, I would usually opt for pinning the major and minor version. This way, users will get the newest security patches and bug fixes whenever they create an environment, while nothing should change in a backward incompatible way (wherever developers properly follow semantic versioning).

Also, I would only specify direct dependencies and let the environment solver handle any implicit dependencies. This provides a certain level of freedom to meet different needs for different packages, while usually the packages' recipes should specify any restrictions to particular versions.
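As a minimal sketch of the two points above, a wrapper's environment file could list only the tools the wrapper calls directly and pin each to a major.minor version. The package names and version numbers here are illustrative assumptions, not taken from a specific wrapper. In conda's match syntax, a single `=` followed by a partial version is a fuzzy match, so `=1.19` accepts 1.19, 1.19.1, 1.19.2 and so on.

```yaml
# environment.yaml (illustrative; packages and versions are assumptions)
channels:
  - conda-forge
  - bioconda
dependencies:
  # Direct dependencies only, pinned to major.minor.
  # "=1.19" matches any 1.19.x release, so patch-level fixes are picked up.
  - samtools =1.19
  - bcftools =1.19
  # Transitive dependencies are deliberately not listed here;
  # the solver picks versions that satisfy each package's recipe.
```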

Another way to avoid (future) conflicts and keep environment creation quick is to keep environments as small and granular as possible (see Johannes' comment below). If different rules share only some dependencies but not others, I would rather create a separate minimal environment for each rule than reuse a bigger environment. Snakemake wrappers do this anyway, as each wrapper has its own environment definition.

As Johannes pointed out, the same applies to channels: only specify channels that you are actually using. It is no longer necessary to specify the base channel. And when using mamba, you can specify bioconda as the first channel.
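Applied to the channel list from the question, this advice would reduce it to the two channels that actually provide packages. This is a sketch that keeps the order the question used; the answer only says that bioconda may come first when mamba is the solver, so that variant is not shown.

```yaml
# Channels section with the base channel dropped
channels:
  - conda-forge
  - bioconda
```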

Speaking of mamba: if speed matters, I would currently use mamba to do the environment solving. It is usually much faster than conda and is better at ensuring that you get the most up-to-date version of packages. In Snakemake, you can use it via --conda-frontend mamba, as also pointed out in Maarten's comment to the question.

But of course, it always depends on the situation. If you know of version incompatibilities that are not handled by the packages' recipes, it can be necessary to specify and pin implicit dependencies. If you have software whose output can change with a patch version, then you of course have to pin the patch version.
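For these exceptions, a stricter environment file might look like the following sketch. The packages, the version numbers, and the incompatibility itself are hypothetical and only illustrate the syntax. Note that `=1.19.2` still allows different builds of that patch release.

```yaml
# environment.yaml (illustrative; versions and the incompatibility are hypothetical)
channels:
  - conda-forge
  - bioconda
dependencies:
  # Patch version pinned because output may differ between patch releases.
  - samtools =1.19.2
  # Implicit dependency pinned explicitly because of a known
  # incompatibility that the recipe does not encode.
  - htslib =1.19.1
```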
