How to remove all files in a Git repository that are not in the working directory?

Question

I'm in the process of splitting up an old suite of applications which originally resided in a single Subversion repository.

I've converted it over to a Git repository and removed what I don't want, but I'd like to slim the repository down by getting rid of the historical data associated with the deleted files (the original repository will be maintained for reference purposes so it isn't needed in the new one).

Ideally what I'd like to do is go through the entire repository and remove any files or folders not present in the working directory, along with any history associated with them. This would leave me with the contents of HEAD and a history of commits affecting those files. However, I haven't come across a way of doing this (orphaning HEAD doesn't help as it doesn't preserve the history).

Is this possible? I know how to remove a single file or folder from the entire history via git-filter-branch, but there are too many files and folders for this to be a practical approach... unless there's a way of filtering on all files not in HEAD?

Recommended answer

Here's how you can use git filter-branch to get rid of all files that you don't want:

  1. Get a list of the filenames that you don't want to appear in the history, including both the old and the new names in case of renames. For example, put them in a file called toberemoved.txt.

  2. Run git filter-branch like this:

$ git filter-branch --tree-filter "rm -f `cat toberemoved.txt`" branch1 branch2 ...

Here's the relevant excerpt from the git filter-branch man page:

   --tree-filter <command>
       This is the filter for rewriting the tree and its contents. The
       argument is evaluated in shell with the working directory set to
       the root of the checked out tree. The new tree is then used as-is
       (new files are auto-added, disappeared files are auto-removed -
       neither .gitignore files nor any other ignore rules HAVE ANY
       EFFECT!).

So just make sure that every path in the list of files you want deleted is relative to the root of the checked out tree.
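As one possible variation on the command above, the sketch below reads the list one path per line inside the filter instead of expanding it with backticks, so paths containing spaces are not split. The function name `remove_listed_paths` is hypothetical. It assumes the list file's path contains no single quote and that your Git supports the `FILTER_BRANCH_SQUELCH_WARNING` environment variable (older versions simply ignore it). It also uses `rm -rf` so that folders in the list are removed too.

```shell
# Hypothetical helper: rewrite the given revisions (branch names or HEAD),
# deleting every path listed, one per line, in the list file given as $1.
remove_listed_paths() {
    list=$1
    shift
    # The tree filter runs inside a temporary checkout, not in the current
    # directory, so the list file must be referenced by an absolute path.
    case $list in
        /*) ;;
        *) list=$PWD/$list ;;
    esac
    # The loop skips empty lines and removes each listed path, if present,
    # from the tree of the commit being rewritten.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --tree-filter "
        while IFS= read -r p; do
            if [ -n \"\$p\" ]; then rm -rf -- \"\$p\"; fi
        done < '$list'
    " -- "$@"
}
```

For example, `remove_listed_paths toberemoved.txt branch1 branch2` would rewrite both branches. Commits that end up empty are kept as they are; the original refs remain available under refs/original/ until you delete them.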

Update:

To get the list of the files that were present in the past but are not in the current working directory, you can run the following from the root of the repository. Note that you'll need to do some extra work to keep the "history before renaming" of renamed files:

$ git log --raw | awk '/^:/ { if (! printed[$6]) { print $6; printed[$6] = 1 }}' | while read -r f; do if [ ! -f "$f" ]; then echo "Deleted: $f"; fi; done

Here $6 is the name of the file affected by a commit, as shown in the --raw mode of git log.
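An alternative way to build the same list, sketched here under a few assumptions, is to compare every path that ever appears in the history with the paths Git currently tracks. The helper names `only_in_first` and `list_removed_paths` are hypothetical, and the sketch assumes path names are plain enough that Git prints them unquoted. Note that it looks at tracked files rather than at what happens to exist on disk.

```shell
# Print the non-empty lines of list file $1 that do not appear in list file $2.
only_in_first() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # comm needs both inputs sorted the same way; LC_ALL=C keeps sort and
    # comm in agreement about ordering. sed drops the blank separator lines.
    LC_ALL=C sort -u "$1" | sed '/^$/d' > "$a"
    LC_ALL=C sort -u "$2" > "$b"
    # -23 suppresses lines unique to the second file and lines common to both.
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# Hypothetical wrapper: list paths touched anywhere in the current branch's
# history that are no longer tracked.
list_removed_paths() {
    hist=$(mktemp) && cur=$(mktemp) || return 1
    # --no-renames reports a rename as a delete plus an add, so old names
    # are listed as well.
    git log --no-renames --pretty=format: --name-only > "$hist"
    # --full-name prints paths relative to the repository root.
    git ls-files --full-name > "$cur"
    only_in_first "$hist" "$cur"
    rm -f "$hist" "$cur"
}
```

Running `list_removed_paths > toberemoved.txt` inside the repository would then produce a candidate list; it is worth reviewing by hand before feeding it to git filter-branch, since it also contains the old names of files that were merely renamed.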

See the --diff-filter option of git log if you want to know what happened ([D]eleted, [R]enamed, [M]odified, and so on) to each file in every commit.

Maybe others can chime in on how to find out the previous name of a tracked file in case of renames.
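One way to approach the rename question, offered as a sketch rather than a definitive answer, is to follow a single file with `git log --follow --name-status` and pick out the rename records, which have the form `R<score>`, old path, new path, separated by tabs. The helper names below are hypothetical, and rename detection is heuristic, so heavily edited files may not be recognized as renames.

```shell
# Read `git log --name-status` output on stdin and print the old path from
# every rename record (status R<score>, then old path, then new path).
old_names_from_name_status() {
    awk -F '\t' '$1 ~ /^R[0-9]*$/ { print $2 }'
}

# Hypothetical wrapper: print the earlier names of the tracked file $1.
# --follow only works when exactly one path is given.
previous_names() {
    git log --follow --name-status --pretty=format: -- "$1" |
        old_names_from_name_status
}
```

Calling `previous_names path/to/file` for each file you intend to keep would give the old names that should stay out of toberemoved.txt if their history is to be preserved.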
