How to speed up / parallelize downloads of git submodules using git clone --recursive?
Cloning git repositories that have a lot of submodules takes a really long time. The following example has ~100 submodules:
git clone --recursive https://github.com/Whonix/Whonix
Git clones them one by one, which takes much longer than required. Let's make the (probable) assumption that both the client and the server have sufficient resources to answer multiple (parallel) requests at the same time.
How can I speed up / parallelize the download of git submodules when using git clone --recursive?
When I run your command, it takes 338 seconds of wall-clock time to download the 68 Mb.
With the following Python program, which relies on GNU parallel being installed,
#!/usr/bin/env python
# coding: utf-8

from __future__ import print_function

import os
import subprocess

jobs = 16

modules_file = '.gitmodules'

packages = []

if not os.path.exists('Whonix/' + modules_file):
    subprocess.call(['git', 'clone', 'https://github.com/Whonix/Whonix'])

os.chdir('Whonix')

# get list of packages from .gitmodules file
with open(modules_file) as ifp:
    for line in ifp:
        if not line.startswith('[submodule '):
            continue
        package = line.split(' "', 1)[1].split('"', 1)[0]
        # print(package)
        packages.append(package)


def doit():
    # universal_newlines=True lets communicate() take and return text on both
    # Python 2 and 3; stderr is piped so that res[1] is actually populated
    p = subprocess.Popen(['parallel', '-N1', '-j{0}'.format(jobs),
                          'git', 'submodule', 'update', '--init',
                          ':::'],
                         stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE, universal_newlines=True)
    res = p.communicate('\n'.join(packages))
    print(res[0])
    if res[1]:
        print("error", res[1])
    print('git exit value', p.returncode)
    return p.returncode


# sometimes one of the updates interferes with the others and generates lock
# errors, so we retry
for x in range(10):
    if doit() == 0:
        print('zero exit from git after {0} times'.format(x+1))
        break
else:
    print('could not get a zero exit from git after {0} times'.format(
        x+1))
that time is reduced to 45 seconds (on the same system; I did not do multiple runs to average out fluctuations).
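The same idea can also be sketched without GNU parallel, using a thread pool from the Python 3 standard library. This is only an illustration, not something I timed. The helper names `parse_submodule_paths` and `update_submodules` are made up for this example, it reads the `path = ...` entries of `.gitmodules` rather than the section names, and the `run` parameter is a hypothetical hook so the function can be tried without touching the network. Lock errors may still occur, so a retry loop like the one in the script is probably still needed.

```python
import re
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Matches lines such as "\tpath = packages/foo" inside .gitmodules.
PATH_RE = re.compile(r'^\s*path\s*=\s*(.+?)\s*$')


def parse_submodule_paths(text):
    """Return the submodule paths listed in the text of a .gitmodules file."""
    paths = []
    for line in text.splitlines():
        match = PATH_RE.match(line)
        if match:
            paths.append(match.group(1))
    return paths


def update_submodules(paths, jobs=16, run=subprocess.call):
    """Run 'git submodule update --init -- <path>' per path, `jobs` at a time.

    Returns a dict mapping each path to the exit code of its git command.
    `run` is an assumption of this sketch: it defaults to subprocess.call
    and can be replaced by a fake for testing.
    """
    def update_one(path):
        return run(['git', 'submodule', 'update', '--init', '--', path])

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        codes = list(pool.map(update_one, paths))
    return dict(zip(paths, codes))
```

To use it, clone the repository without `--recursive`, change into the checkout, read `.gitmodules` into a string, and pass the result of `parse_submodule_paths` to `update_submodules`. Any path whose exit code is non-zero can then be retried.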
To check if things were OK, I "compared" the checked out files with:
find Whonix -name ".git" -prune -o -type f -print0 | xargs -0 md5sum > /tmp/md5.sum
in one directory and
md5sum -c /tmp/md5.sum
in the other directory and vice versa.
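If `md5sum` is not available, roughly the same comparison can be done in Python. This is a minimal sketch under my own assumptions: the function names `tree_checksums` and `differing_files` are invented for this example, and like the `find` command it skips everything named `.git` (directories as well as the `.git` files that submodules use) and ignores symlinks.

```python
import hashlib
import os


def tree_checksums(root):
    """Map each regular file's path (relative to root) to its MD5 hex digest."""
    sums = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git directories in place, like `-name ".git" -prune` in find.
        dirnames[:] = [d for d in dirnames if d != '.git']
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip .git files (used by submodules) and symlinks (`-type f`).
            if name == '.git' or os.path.islink(full):
                continue
            digest = hashlib.md5()
            with open(full, 'rb') as fh:
                for chunk in iter(lambda: fh.read(65536), b''):
                    digest.update(chunk)
            sums[os.path.relpath(full, root)] = digest.hexdigest()
    return sums


def differing_files(root_a, root_b):
    """Return the sorted relative paths that are missing or differ between trees."""
    a = tree_checksums(root_a)
    b = tree_checksums(root_b)
    return sorted(p for p in set(a) | set(b) if a.get(p) != b.get(p))
```

Calling `differing_files` on the two checkouts should return an empty list when they contain the same files. It covers both directions at once, so there is no need to repeat the check "vice versa".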