Do version control systems use diffs to store binary files?


Question

How do popular version control systems (svn, git) store revisions of a binary document? I have projects with binary sources that are periodically updated and need to be checked in (mostly Photoshop documents, a custom data format, and a few word processing documents). I've always been worried about checking in binaries because I thought the VCS might take the simple route of uploading a new copy of the binary each time, so my repository would get huge quickly.

If I have several data blocks (let's call them A, B, C, D, etc.) and a binary file that looks like ABC on the first check-in but has been modified to ADBE by the second check-in, will my VCS be smart enough to store only the changed bits, or will it create an entirely new image of the file?

Recommended answer

tl;dr

Git can store binary files as diffs, but not very efficiently, so you should probably use an external tool for them.

By default, git doesn't store diffs between commits. When you change a file and make a new commit, git stores an object containing the whole file's content. It doesn't matter whether you change one line or rewrite the whole file: git doesn't store diffs, at least not at first. A part of git called git-gc (the garbage collector), which handles tasks such as removing dangling commits and optimizing the repository, runs another git command, git-repack, which does exactly what you ask for. It takes the whole set of objects and stores them inside a single pack using delta compression.
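The two stages described above can be sketched in Python. The first half reproduces how git names and stores a blob (a `blob <size>\0` header plus the full content, hashed with SHA-1 and zlib-compressed), so the ABC and ADBE versions from the question become two complete objects. The second half is only a toy copy/insert delta built with `difflib`; it is not git's actual pack delta format, and the names `toy_delta`, `apply_delta` and the 64-byte blocks are assumptions made for illustration.

```python
import hashlib
import zlib
from difflib import SequenceMatcher


def blob_payload(content: bytes) -> bytes:
    """Prefix the content with the header git uses for blob objects."""
    return b"blob " + str(len(content)).encode("ascii") + b"\0" + content


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID git assigns to a blob with this content."""
    return hashlib.sha1(blob_payload(content)).hexdigest()


def loose_object(content: bytes) -> bytes:
    """Return the zlib-compressed bytes of a loose object: the whole file, no diff."""
    return zlib.compress(blob_payload(content))


def toy_delta(base: bytes, target: bytes) -> list:
    """Describe target as copy-from-base and insert-literal steps.

    Hypothetical illustration only; git's real pack deltas use a different
    matching algorithm and a compact binary encoding.
    """
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))  # (offset in base, length)
        elif tag in ("insert", "replace"):
            ops.append(("insert", target[j1:j2]))  # literal new bytes
        # "delete" needs no step: the bytes are simply never copied
    return ops


def apply_delta(base: bytes, ops: list) -> bytes:
    """Rebuild the target from the base and the steps from toy_delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


# Hypothetical 64-byte blocks standing in for A, B, C, D, E from the question.
BLOCK = {name: name.encode("ascii") * 64 for name in "ABCDE"}
v1 = BLOCK["A"] + BLOCK["B"] + BLOCK["C"]
v2 = BLOCK["A"] + BLOCK["D"] + BLOCK["B"] + BLOCK["E"]

# At commit time: two different IDs, hence two complete objects.
print(git_blob_id(v1), len(loose_object(v1)))
print(git_blob_id(v2), len(loose_object(v2)))

# At repack time (conceptually): v2 expressed relative to v1.
delta = toy_delta(v1, v2)
print([op[:1] + op[1:] if op[0] == "copy" else ("insert", len(op[1])) for op in delta])
```

In this sketch only blocks D and E have to be stored as literal bytes, while A and B become references into the earlier version. That is the general idea behind delta compression in a pack, under the simplifying assumption that unchanged blocks stay byte-for-byte identical.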

Unfortunately, packing with git-repack is not especially efficient when it comes to compressing binary files. You can always tweak it, but if your files change a lot, or if they are really big, you should probably use an external tool such as lfs.
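Assuming "lfs" here means Git LFS, the idea is that the repository stores a small text pointer in place of the large file, and the file's bytes live in separate storage. A minimal sketch of what such a pointer looks like, following the published Git LFS pointer layout (a `version` line, a SHA-256 `oid`, and a `size`); the function name `lfs_pointer` is hypothetical, and the real tool does this for you through its filters:

```python
import hashlib


def lfs_pointer(content: bytes) -> str:
    """Build the text of a Git LFS style pointer for the given file content.

    Illustration only: the git-lfs tool generates pointers itself.
    """
    oid = hashlib.sha256(content).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(content)}\n"
    )


# A hypothetical 5 MB binary and a slightly modified revision of it.
big_file = bytes(5_000_000)
revised = big_file[:-1] + b"\x01"

pointer_a = lfs_pointer(big_file)
pointer_b = lfs_pointer(revised)
print(pointer_a)
print(len(pointer_a), len(pointer_b))
```

Each revision of the binary adds only a pointer of a hundred-odd bytes to the git history, whatever the size of the file itself, so the repository stays small even when packing cannot find useful deltas.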
