git pull multiple remotes in parallel

Problem description

I have a repo with thousands of remotes, and I'd like to pull from all of them at the same time. Ideally, I could specify a maximum number to run concurrently.

I wasn't able to find anything related to this in the man pages, on Google, or on git-scm online.

To be perfectly clear: I do not want to run one command over multiple repos; I have one repo with thousands of remotes.

This has nothing to do with submodules, don't talk about submodules. Submodules are unrelated to git remotes.

Solution

I'm pretty sure you have to write your own code to do this.

As CodeWizard says in a comment, Git needs to lock parts of the repository. Some of these locks are bound to collide at times, if you simply run multiple git fetch processes in parallel within a single repository.
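As a rough illustration of what "your own code" could look like, here is a minimal Python sketch that runs a bounded number of git fetch processes at once and retries a remote whose fetch fails. The function names, the retry count, and the backoff delay are all assumptions made for this example, and the sketch retries on any non-zero exit status because it does not try to tell a lock collision apart from other failures.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def list_remotes(repo="."):
    """Return the remote names configured in the repository at `repo`."""
    out = subprocess.run(
        ["git", "-C", repo, "remote"],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.splitlines()


def fetch_one(remote, repo=".", retries=3, backoff=0.5, run=subprocess.run):
    """Fetch one remote, retrying on any non-zero exit status.

    A failure may be a lock collision with a sibling fetch, but it may
    also be a network error; this sketch treats both the same way.
    """
    for attempt in range(1, retries + 1):
        result = run(
            ["git", "-C", repo, "fetch", remote],
            capture_output=True, text=True,
        )
        if result.returncode == 0:
            return remote, True
        if attempt < retries:
            time.sleep(backoff * attempt)  # linear backoff (hypothetical choice)
    return remote, False


def fetch_all(remotes, max_parallel=8, **fetch_kwargs):
    """Fetch every remote, running at most `max_parallel` fetches at once."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        results = pool.map(lambda r: fetch_one(r, **fetch_kwargs), remotes)
        return dict(results)
```

Calling `fetch_all(list_remotes(), max_parallel=8)` from inside the repository returns a dict that maps each remote name to True or False. Threads are enough here because each worker only waits on a child git process. Recent Git releases also document a `--jobs` option for `git fetch`, so it is worth checking `git fetch --help` for your version before writing anything custom.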

You might also want some kind of remote-ordering strategy. For example, fetching from remoteA, remoteB, and remoteC in parallel may pull 10000 objects from remoteB that duplicate what the other two send, if remoteB is generally (but not always) a superset of remoteA and remoteC.[1] Ordering also matters for sequential git fetch operations, but considerably less.

Suppose, for example, that there are 5000 objects (some commits, some trees, and some blobs) on A that you do not yet have, 5000 others on C, and all 10000 on B. If you fetch sequentially, in any order, you pick up either 5k, then 5k, then 0; or 10k, then 0, then 0; because by the time you move to the next remote, you have already collected and stored the 5k or 10k incoming objects. But if you do all three in parallel, you bring in 5k, 5k, and 10k objects, and only then discover that you have doubled your workload.
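The 5k/5k/10k arithmetic can be checked with a toy model in which each remote is just a set of object IDs. This is a deliberate simplification: real Git negotiates what to send based on commits, not on a flat set difference, and the helper names here are made up for the example.

```python
def sequential_cost(remotes, order):
    """Objects transferred when fetching one remote at a time, in `order`."""
    have, transferred = set(), 0
    for name in order:
        missing = remotes[name] - have  # only what we do not have yet
        transferred += len(missing)
        have |= missing
    return transferred


def parallel_cost(remotes):
    """Objects transferred when every fetch starts before any has finished.

    Assumes none of the objects are present locally at the start, so each
    remote sends everything it has.
    """
    return sum(len(objects) for objects in remotes.values())


remotes = {
    "A": set(range(0, 5000)),      # 5000 objects you lack
    "C": set(range(5000, 10000)),  # 5000 other objects
    "B": set(range(0, 10000)),     # all 10000
}

print(sequential_cost(remotes, ["A", "C", "B"]))  # 5k + 5k + 0   -> 10000
print(sequential_cost(remotes, ["B", "A", "C"]))  # 10k + 0 + 0   -> 10000
print(parallel_cost(remotes))                     # 5k + 5k + 10k -> 20000
```

In this model every sequential order moves 10000 objects in total, while the fully parallel run moves 20000.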


[1] If B is always a superset, simply fetch from B first (sequentially), then fetch from A and C in parallel solely for their references, which will point to objects you now have.
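The ordering described in the footnote could be sketched as follows: one blocking fetch from the superset remote, then the remaining remotes concurrently. `git_fetch` and `fetch_superset_first` are hypothetical helpers for this example, and the sketch assumes it runs with the repository as the current working directory.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def git_fetch(remote):
    """Run `git fetch <remote>` in the current repository; True on success."""
    return subprocess.run(["git", "fetch", remote]).returncode == 0


def fetch_superset_first(superset, others, max_parallel=8, fetch=git_fetch):
    """Fetch `superset` alone, then fetch the remaining remotes concurrently."""
    others = list(others)
    # Sequential step: this call returns before any other fetch starts.
    results = {superset: fetch(superset)}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map yields results in the same order as `others`.
        results.update(zip(others, pool.map(fetch, others)))
    return results
```

For example, `fetch_superset_first("remoteB", ["remoteA", "remoteC"])` would fetch remoteB to completion before starting the other two. Whether the later fetches really transfer only references depends on B being a true superset at that moment.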
