How to squash a Git repository to a single commit and destroy everything else?


Question





I would like to squash an entire Git repository down to a single commit, and actually remove all other commits.

I have found several suggestions, including:

$ git reset --soft <root-commit>

This works with respect to the squashing, but it's still possible to check out the previous commits if you know their IDs. How can I get rid of them as well?

Maybe the simplest solution would be to delete the .git directory, and run git init again, wouldn't it? If I re-add the origin, and then use git push --force, I could even keep the same GitHub repository, right?

PS: In this question I have clarified what I actually want to achieve.

Solution

UPDATE - cantSleepNow's comment got me thinking about a couple caveats to my answer.

  • You want to be aware of the state of untracked files, especially if you do rebuild the repository. What exactly that means depends on how you use your work tree, and on how your ignore rules are set up.

  • You also may have repository-specific configuration to consider.

Untracked Files

I generally keep my worktree in a "clean" state, meaning that git status should not report anything untracked most of the time. Further, I try to use .gitignore for my ignore rules, which should ideally be few in number (directory-based rules for output directories, pattern-based rules for IDE-generated files that might be sprinkled throughout the work tree...)

If you follow those same practices, then you usually shouldn't have to do anything special about untracked files; your ignore patterns will still be there when you init the new repo. However, if you previously had committed files that would match your ignore rules (and if this is deliberate such that you still want them), then you'd have to force-add them to your new repo (or else remove the ignore rules, add them, and then re-add the ignore rules).

If you have local ignore rules in .git/info/exclude, then of course those would go away when you delete .git (unless you back them up).

If you keep untracked files that aren't in your ignore rules, you'll have to make sure you don't accidentally add them to the new repo. (I would encourage you to use ignore rules for those going forward.) One solution, if you know you don't need the contents of any untracked files, is to use git clean to be rid of them.
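Before rebuilding, it can help to see exactly which files are untracked or ignored, and what git clean would delete. The following is a minimal sketch in a throwaway repository; the file names (tracked.txt, notes.txt, build/) are invented for the demo, and the dry-run flag -n means nothing is actually removed.

```shell
set -eu
# Scratch repository; every path and file name here is made up for the demo.
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo 'build/' > .gitignore
echo 'source' > tracked.txt
git add .gitignore tracked.txt
mkdir build
echo 'output' > build/out.bin     # matches the build/ ignore rule
echo 'scratch' > notes.txt        # untracked and not ignored

# In porcelain output, untracked files are marked '??' and ignored ones '!!'.
status=$(git status --porcelain --ignored)
echo "$status"

# Dry run: list what git clean would delete, without deleting anything.
# -d includes untracked directories; ignored files are skipped unless -x is given.
preview=$(git clean -n -d)
echo "$preview"
```

Only once the preview looks right would one rerun git clean with -f in place of -n.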

Repo Configuration

Your .git directory can contain things like repo-specific config settings, hook scripts, local exclude rules (touched on above), LFS configuration (and object content), ...

If your usage of git is simple, you might not have any of these things. If you do anything that's repo-specific (and not checked in / source controlled), then it likely is stored under .git and you need to review whether to back it up. If you're not sure, then you may need to use a different method to safely clean the repo (so I'll provide one below).
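If you do decide to back things up, one possible approach is to copy the relevant pieces out of .git before deleting it and copy them back after git init. This is only a sketch under assumptions: the hook, the exclude rule and the e-mail address are placeholders, and it does not cover LFS object content.

```shell
set -eu
repo=$(mktemp -d)
backup=$(mktemp -d)
cd "$repo"
git init -q .

# Hypothetical repo-specific settings, invented for this demo.
git config user.email "demo@example.com"
mkdir -p .git/hooks .git/info
printf '#!/bin/sh\nexit 0\n' > .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit
echo 'scratch/' >> .git/info/exclude

# Copy the pieces worth keeping before .git is deleted.
cp .git/config "$backup/config"
cp -Rp .git/hooks "$backup/hooks"
cp .git/info/exclude "$backup/exclude"

# Rebuild the repository, then restore the saved pieces.
rm -rf .git
git init -q .
mkdir -p .git/hooks .git/info
cp "$backup/config" .git/config
cp -Rp "$backup/hooks/." .git/hooks/
cp "$backup/exclude" .git/info/exclude
```

Note that restoring the whole config file also brings back any remote and branch sections it contained, which may or may not be what you want.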

So getting back to your options...

Originally I suggested that the simplest thing to do, if you want to be sure history is gone, is

rm -rf .git
git init
git add .
git commit

Any other procedure is mostly just a longer / more error-prone way to imitate this result. But you may have extra steps if you identified things you want to keep from .git, like hooks or local config. And if you aren't sure whether anything in .git might still be needed, then you need a way to just delete what you don't want.
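To convince yourself that the rebuild really drops the history, you could try it in a scratch repository first. The sketch below assumes a placeholder identity and a made-up file; it records an old commit ID, rebuilds, and then checks that only one commit exists and that the old ID no longer resolves.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
# Placeholder identity so the demo does not depend on global configuration.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# Build a small history of two commits.
git init -q .
echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -a -m "second"
old=$(git rev-parse HEAD)

# The rebuild itself: throw away .git and commit the work tree afresh.
rm -rf .git
git init -q .
git add .
git commit -q -m "Single squashed commit"

# Verify: exactly one commit, and the old ID is unknown to the new repo.
count=$(git rev-list --count HEAD)
echo "commits: $count"
if git cat-file -e "$old^{commit}" 2>/dev/null; then
    echo "old commit still present"
else
    echo "old commit gone"
fi
```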

To cleanse a repo of content:

First, make sure the content you want in your new single commit is what's checked out in your work tree.

Now, if you aren't on master, go ahead and

git branch -f master
git checkout master

Then delete all of the refs. You can use git commands to do this (and in some circumstances that's safer), but the simplest way if you know you want to wipe them all out is

rm -f .git/packed-refs
rm -rf .git/refs/*

This will kind of confuse git, but it will leave you in a state where your index and work tree are unchanged (still your old master state), but there's no recognized parent commit, so everything is a newly added file.

git commit

You should get a new commit with no history, and master should point to it.
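For the ref-deletion step, the git-command route mentioned above might look roughly like the following. It is a sketch in a scratch repository with invented branch and tag names (topic, v1); it goes through git for-each-ref and git update-ref -d, so it does not depend on how refs are stored on disk.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
# Placeholder identity so the demo does not depend on global configuration.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -a -m "second"
git tag v1
git branch topic

# Delete every ref (branches and tags), whether packed or loose.
git for-each-ref --format='%(refname)' | while read -r ref; do
    git update-ref -d "$ref"
done

# HEAD still names refs/heads/master, which no longer exists, and the index
# is untouched, so this commit becomes a root commit with the same content.
git commit -q -m "Single squashed commit"

count=$(git rev-list --count HEAD)
refs=$(git for-each-ref --format='%(refname)')
echo "commits: $count"
echo "$refs"
```

At this point the old commits are still reachable through the reflog, so this does not remove them by itself.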

Now you need to get rid of the reflog, because it can still reach the old commits. Again you could use git commands, but I've had the best luck with

rm -rf .git/logs

And now you can get rid of the old commits with

git gc --aggressive --prune=now

and verify that old commits are no longer to be found.
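If you would rather not delete .git/logs by hand, the git-command alternative for the reflog is git reflog expire. The sketch below, run in a scratch repository with a placeholder identity, expires all reflog entries, runs the same gc, and then checks whether a recorded old commit ID still resolves.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
# Placeholder identity so the demo does not depend on global configuration.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q .
git symbolic-ref HEAD refs/heads/master
echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two > file.txt
git commit -q -a -m "second"
old=$(git rev-parse HEAD)

# Squash: drop the branch ref and commit the unchanged index as a root commit.
git update-ref -d refs/heads/master
git commit -q -m "Single squashed commit"

# The old commit is unreferenced now, but the object is still in the repo.
if git cat-file -e "$old^{commit}" 2>/dev/null; then before=present; else before=gone; fi

# Expire every reflog entry, then prune unreachable objects immediately.
git reflog expire --expire=now --all
git gc -q --aggressive --prune=now

if git cat-file -e "$old^{commit}" 2>/dev/null; then after=present; else after=gone; fi
echo "before gc: $before, after gc: $after"
```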

That's fine for your local repo; but GitHub...

You've expressed a desire to keep your existing repo, but you've also noted that you don't want someone to be able to get the old commits even if they know the SHA1.

A force push will overwrite the ref for the upstream of the current branch (probably master since you haven't specified otherwise). It will not affect other refs (branches, tags) if there are any, and it will not affect other commits.
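This can be checked locally without touching GitHub. The sketch below uses a local bare repository as a stand-in for the remote (an assumption for the demo; the names remote.git, work and v1 are invented). After the force push, master points at the new commit, but the tag still points at the old one and the old commit object is still present on the remote.

```shell
set -eu
base=$(mktemp -d)
cd "$base"
# Placeholder identity so the demo does not depend on global configuration.
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q --bare remote.git             # stand-in for the hosted repository
git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
git remote add origin "$base/remote.git"
echo one > file.txt
git add file.txt
git commit -q -m "first"
git tag v1
git push -q origin master v1
old=$(git rev-parse HEAD)

# Rebuild locally as a single commit, re-add the remote, and force push.
rm -rf .git
git init -q .
git symbolic-ref HEAD refs/heads/master
git remote add origin "$base/remote.git"
git add .
git commit -q -m "Single squashed commit"
new=$(git rev-parse HEAD)
git push -q --force origin master

# master moved, but the tag and the old commit object are untouched.
remote_refs=$(git ls-remote origin)
echo "$remote_refs"
if git --git-dir="$base/remote.git" cat-file -e "$old^{commit}"; then
    echo "old commit still on remote"
fi
```

Other refs would have to be deleted separately (for example with git push origin --delete followed by the ref name) before a gc on the remote could drop the old commits.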

To remove commits, you need (1) to be sure nothing (short of a direct SHA1 reference) can reach them, and (2) to run git gc. A tweet from GitHub support says:

We run git gc at most once per day, triggered automatically by a push.

So it seems you don't have much control over that. The force push might trigger a gc, and that gc might clear away the old commits, but you'd have to test whether it really did (clear your browser cache, try to access one of the commits that should be gone).

As with the local repo, if this is important then it's probably easier and safer to delete the repo and create a new one.
