Remove old commit information from a git repository to save space
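Problem description
I have a repository for storing some large binary files (tifs, jpgs, pdfs) that is growing pretty large. A fair number of files are also created, removed, and renamed, and I don't care about the individual commit history. This question is somewhat simplified because I'm dealing with a repository that has no branches and no tags.
I'm curious if there's an easy way to remove some of the history from the system to save space.
I found an old thread on the git mailing list, but it doesn't really specify how to use this (i.e. what $drop is):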
git filter-branch --parent-filter "sed -e 's/-p $drop//'" \
    --tag-name-filter cat -- \
    --all ^$drop
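For readers wondering the same thing about `$drop`: one plausible reading, based on how `--parent-filter` is documented (it receives the parent list as `-p <hash>` on stdin), is that `$drop` is a shell variable holding the full hash of the newest commit to discard. The sketch below tries that reading on a throwaway repository. The repository, the commit names `c1` to `c5`, and the choice of `c2` as the cut point are all assumptions made for illustration, not something the mailing-list thread confirms.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway repository; nothing real is rewritten.
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Five commits with subjects c1 .. c5.
for n in 1 2 3 4 5; do
    echo "revision $n" > data.txt
    git add data.txt
    git commit -q -m "c$n"
done

# Assumption: $drop is the full hash of the newest commit to discard (here c2).
drop=$(git rev-parse HEAD~3)

# filter-branch is deprecated; this variable only silences its startup warning.
export FILTER_BRANCH_SQUELCH_WARNING=1

# Same command as in the question: strip "-p $drop" from every parent list,
# and rewrite only the commits that are not reachable from $drop.
git filter-branch --parent-filter "sed -e 's/-p $drop//'" \
    --tag-name-filter cat -- \
    --all ^"$drop" >/dev/null

# Show the subjects that remain on the rewritten branch.
git log --format=%s
```

Under this reading, the first commit after `$drop` loses its parent and becomes the new root, so everything up to and including `$drop` falls out of the branch. Note that `git filter-branch` keeps the old tips under `refs/original/`, so the dropped commits stay referenced until those refs are deleted.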
I think you can shrink your history by following this answer: How to delete a specific revision of a github gist?
Decide which points in history you want to keep. In the interactive rebase todo list, they are marked with "<- keep" here:
pick <hash1> <commit message>
pick <hash2> <commit message>
pick <hash3> <commit message> <- keep
pick <hash4> <commit message>
pick <hash5> <commit message>
pick <hash6> <commit message> <- keep
pick <hash7> <commit message>
pick <hash8> <commit message>
pick <hash9> <commit message>
pick <hash10> <commit message> <- keep
Then leave the first commit of each group (the very first commit, and the first one after each "keep") as "pick" and mark the others as "squash":
pick <hash1> <commit message>
squash <hash2> <commit message>
squash <hash3> <commit message> <- keep
pick <hash4> <commit message>
squash <hash5> <commit message>
squash <hash6> <commit message> <- keep
pick <hash7> <commit message>
squash <hash8> <commit message>
squash <hash9> <commit message>
squash <hash10> <commit message> <- keep
Then run the rebase by saving and quitting the editor. At each "keep" point, the message editor will pop up with a combined commit message covering the range from the previous "pick" up to the "keep" commit. You can then either keep just the last message or combine all of them to document the original history without keeping all the intermediate states.
After that rebase, the intermediate file data will still be in the repository, but it is now unreferenced.
git gc
will now indeed get rid of that data.
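The whole procedure can be rehearsed on a scratch repository before touching real data. The following is a minimal sketch, not the answerer's own script: it builds ten dummy commits and uses git's `GIT_SEQUENCE_EDITOR` variable with `sed` as a stand-in for marking the todo list by hand. The repository, file name, and commit messages are hypothetical, and the `sed` line ranges assume the todo file starts directly with the ten `pick` lines. It also expires the reflog and passes `--prune=now`, because by default `git gc` keeps recently unreferenced objects for a grace period and the reflog still points at the old commits.

```shell
#!/bin/sh
set -eu

# Hypothetical throwaway repository, so no real history is rewritten.
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Ten commits standing in for <hash1> .. <hash10>.
n=1
while [ "$n" -le 10 ]; do
    echo "revision $n" > data.txt
    git add data.txt
    git commit -q -m "commit $n"
    n=$((n + 1))
done

# Stand-in for editing the todo list by hand: todo lines 1, 4 and 7 stay
# "pick", the rest become "squash" (keep points are commits 3, 6 and 10).
# Assumes the todo file starts directly with the ten "pick" lines.
todo_edit="sed -i.bak -e '2,3s/^pick/squash/' -e '5,6s/^pick/squash/' -e '8,10s/^pick/squash/'"

# GIT_EDITOR=true accepts each combined commit message unchanged.
GIT_SEQUENCE_EDITOR="$todo_edit" GIT_EDITOR=true git rebase -q -i --root

# Drop reflog entries that still point at the old commits, then prune.
git reflog expire --expire=now --all
git gc -q --prune=now

# Show the subjects of the commits that remain.
git log --format=%s
```

On a real repository you would run `git rebase -i` interactively instead and edit the messages yourself. Since this rewrites history, anyone else with a clone would need to re-clone or reset to the new history afterwards.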