How to handle a large git repository?


Problem description

I am currently using git for a large repository (around 12 GB, each branch having a size of 3 GB). This repository contains lots of binary files (audio and images).

The problem is that clone and pull can take a long time. In particular, the "Resolving deltas" step can be very long.

What is the best way to solve this kind of problem?

I tried to disable delta compression, as explained here, using the delta option in .gitattributes, but it does not seem to improve the clone duration.

Thanks in advance

Kevin

Solution

Update April 2015: Git Large File Storage (LFS) (by GitHub).

It uses git-lfs (see git-lfs.github.com) and has been tested with a server supporting it, lfs-test-server:
You store only metadata in the git repo, and the large files elsewhere.
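To make "metadata only" concrete, here is a minimal Python sketch of the small text pointer that Git LFS commits in place of a large file, following the pointer layout in the Git LFS specification (a version line, a SHA-256 object id, and the size in bytes). The helper name `make_lfs_pointer` is hypothetical and only for illustration; in practice git-lfs generates these pointers for you.

```python
import hashlib

# Version line used by Git LFS pointer files (per the Git LFS spec).
LFS_SPEC_URL = "https://git-lfs.github.com/spec/v1"


def make_lfs_pointer(content: bytes) -> str:
    """Return the pointer text that would stand in for `content` in the repo.

    Hypothetical helper for illustration only; git-lfs does this itself.
    """
    oid = hashlib.sha256(content).hexdigest()
    return (
        f"version {LFS_SPEC_URL}\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


if __name__ == "__main__":
    # Stand-in for a 5 MiB audio file; only this short pointer is versioned.
    audio = b"\x00" * (5 * 1024 * 1024)
    print(make_lfs_pointer(audio), end="")
```

Whatever the size of the binary, the pointer stays well under a few hundred bytes, which is why clone and pull no longer have to transfer and delta-resolve the media history.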


Original answer (2012)

One solution, for large binary files that don't change much, is to store them in a separate repository (like a Nexus repository), and to version only a text file which declares which version you need.
Using an "artifact repository" is easier than storing binary elements in a source repo (which is made for comparing versions and merging between branches, neither of which is of much use for said binaries).
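As a rough sketch of the "version only a text file" idea, the Python snippet below parses a small manifest of `name version` lines and derives the download URL for each artifact. The manifest format, the `BASE_URL` host, and the URL layout are all assumptions made for illustration; a real Nexus setup will have its own conventions.

```python
from typing import Dict

# Hypothetical artifact server; replace with your own (e.g. a Nexus instance).
BASE_URL = "https://artifacts.example.com/repository/assets"


def parse_manifest(text: str) -> Dict[str, str]:
    """Parse 'name version' lines, ignoring blank lines and '#' comments."""
    versions: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, version = line.split()
        versions[name] = version
    return versions


def artifact_url(name: str, version: str) -> str:
    """Build a download URL (assumed layout, not a Nexus API)."""
    return f"{BASE_URL}/{name}/{version}/{name}-{version}.zip"


if __name__ == "__main__":
    manifest = "# binary assets pinned for this branch\nsounds 1.4.2\ntextures 2.0.0\n"
    for name, version in parse_manifest(manifest).items():
        print(artifact_url(name, version))
```

Only the manifest lives in git, so switching branches changes a few lines of text, and a build step fetches the matching binaries.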

The other solution, more git-centric, is git-annex:

git-annex allows managing files with git, without checking the file contents into git.
While that may seem paradoxical, it is useful when dealing with files larger than git can currently easily handle, whether due to limitations in memory, time, or disk space.

It is however not compatible with Windows.

A more generic solution could be git-media, which also allows you to use Git with large media files without storing the media in Git itself.

Finally, the easiest solution is to isolate those binaries in their own git submodule, as you mention in your question: it isn't very satisfactory, and the initial clone will still take time, but subsequent updates of the parent repo will be short.
