Moving a large number of large files in a git repository

Problem description

My repository has a large number of large files. They are mostly data (text). Sometimes I need to move these files to another location because of refactoring or packaging.

I use the git mv command to "rename" the paths of the files, but it seems inefficient: the size of the commit (the actual diff size) is very large, the same as with rm followed by git add.

Are there other ways to reduce the commit size? Or should I just add the files to .gitignore and upload them upstream as a zip file?


Thank you for the answers.

FYI, the following series of commands prints the full size of the file bar:

git mv foo bar
git commit -m "modify"
git cat-file -s HEAD:bar

From this I concluded that git did an rm and an add. Could you tell me whether this number is related to the actual commit size or not?

Solution

By design, if you move a file inside a Git repository without changing its content, creating a commit will only store new metadata (a.k.a. tree objects) to represent the new file location. Since the content is unchanged, Git doesn't need to create a new blob object to store the file content. So the "commit size" should be rather small.
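The point above can be checked in a scratch repository. The following is a minimal sketch, not a benchmark: the temporary directory, the demo commit identity and the seq-generated content are assumptions made for illustration, while the names foo and bar come from the question. It counts loose objects before and after a pure rename.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway repository; nothing here touches an existing repo.
repo=$(mktemp -d)
cd "$repo"
git init --quiet

# Create a text file of a bit under 1 MB named foo.
seq 1 150000 > foo
git add foo
git -c user.name=demo -c user.email=demo@example.com commit --quiet -m "add foo"

# Loose objects so far: one commit, one tree, one blob.
objects_before=$(git count-objects -v | awk '/^count:/ {print $2}')

git mv foo bar
git -c user.name=demo -c user.email=demo@example.com commit --quiet -m "modify"

# The rename adds a new commit and a new tree, but no new blob.
objects_after=$(git count-objects -v | awk '/^count:/ {print $2}')

echo "objects before rename: $objects_before"
echo "objects after rename:  $objects_after"
```

The two extra objects are small: the tree holds one entry with the new name, and the commit holds the message and pointers. The large blob is reused as-is.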

Since you say that the diff size is huge, I suppose that some file content was modified along with the relocation. This would be a reason for the "commit size" to be huge.
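One way to test that supposition is to ask Git whether it sees the move as an exact rename. This sketch again uses a hypothetical scratch repository and demo identity; in a real repository only the final git diff line would be needed, run against the commit that moved the files.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway repository used only for this illustration.
repo=$(mktemp -d)
cd "$repo"
git init --quiet

seq 1 1000 > foo
git add foo
git -c user.name=demo -c user.email=demo@example.com commit --quiet -m "add foo"

git mv foo bar
git -c user.name=demo -c user.email=demo@example.com commit --quiet -m "move foo to bar"

# -M turns on rename detection; --name-status prints one status line per path.
# A status of R100 means the path was renamed with 100% identical content.
status=$(git diff -M --name-status HEAD~1 HEAD)
echo "$status"
```

A score below 100 (for example R087), or a separate D line and A line for the old and new paths, would indicate that the content changed together with the move, in which case Git does have to store new blobs.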

In both cases, you can try to shrink the .git directory size with the command git gc --prune --aggressive
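To see what the command does, the object store can be inspected before and after with git count-objects. This is a rough sketch in a hypothetical scratch repository with a demo identity; the --quiet flag is added here only to suppress progress output, and the actual savings in a real repository depend on its history.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway repository used only for this illustration.
repo=$(mktemp -d)
cd "$repo"
git init --quiet

seq 1 150000 > foo
git add foo
git -c user.name=demo -c user.email=demo@example.com commit --quiet -m "add foo"

# Before: objects are stored loose, one compressed file per object.
git count-objects -vH
loose_before=$(git count-objects -v | awk '/^count:/ {print $2}')

# The command from the answer.
git gc --prune --aggressive --quiet

# After: reachable objects have been moved into a packfile.
git count-objects -vH
loose_after=$(git count-objects -v | awk '/^count:/ {print $2}')
packed_after=$(git count-objects -v | awk '/^in-pack:/ {print $2}')
```

In the output, size is the disk space used by loose objects and size-pack is the space used by packfiles, so comparing the two runs shows how much the repack saved.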

EDIT:

git mv foo bar
git commit -m "modify"
git cat-file -s HEAD:bar

These commands create a new commit, but since the content of the foo/bar file has not changed, Git won't store anything new apart from the metadata that records the new file name. In fact, in your example, git cat-file -s HEAD:foo before the rename and git cat-file -s HEAD:bar after it will give you the same result, since it's the same content (the same blob in .git/objects). I think you are misinterpreting what git does internally. Have a look at Git objects for further explanation.
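The claim that both paths point at the same blob can be verified directly with git rev-parse, which prints the object ID behind a path. This is a small sketch in a hypothetical scratch repository with a demo identity; foo and bar are the names from the question. Note that git cat-file -s reports the size of the blob's content, so it says nothing about how much data the commit itself added.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway repository used only for this illustration.
repo=$(mktemp -d)
cd "$repo"
git init --quiet

seq 1 150000 > foo
git add foo
git -c user.name=demo -c user.email=demo@example.com commit --quiet -m "add foo"

# Object ID and content size of the blob behind foo, before the rename.
blob_before=$(git rev-parse HEAD:foo)
size_before=$(git cat-file -s HEAD:foo)

git mv foo bar
git -c user.name=demo -c user.email=demo@example.com commit --quiet -m "modify"

# Object ID and content size of the blob behind bar, after the rename.
blob_after=$(git rev-parse HEAD:bar)
size_after=$(git cat-file -s HEAD:bar)

echo "before: $blob_before ($size_before bytes)"
echo "after:  $blob_after ($size_after bytes)"
```

Both lines show the same object ID and the same size, because the rename only changed which name the tree object attaches to the existing blob.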

Remember that git tracks content, not files.
