Redaction in git


Problem description

I started working on a little Python script for FTP recently. To start off with, I had server, login and password details for an FTP site hardwired in the script, but this didn't matter because I was only working on it locally.

I then had the genius idea of putting the project on GitHub. I realised my mistake soon after, and replaced the hardwired details with a solution involving .netrc. I've now removed the project from GitHub, as anyone could look at the history and see the login details in plain text.

The question is, is there any way to go through the git history and remove user name and password throughout, but otherwise leave the history intact? Or do I need to start a new repo with no history?

Solution

First of all, you should change the password on the FTP site. The password has already been made public; you can't guarantee that no one has cloned the repo, or it's not in plain-text in a backup somewhere, or something of the sort. If the password is at all valuable, I would consider it compromised by now.

Now, for your question about how to edit history. The git filter-branch command is intended for this purpose; it will walk through each commit in your repository's history, apply a command to modify it, and then create a new commit.

In particular, you want git filter-branch --tree-filter. This allows you to edit the contents of the tree (the actual files and directories) for each commit. It runs a command in a directory containing the entire tree; your command may edit files, add new files, delete files, move them, and so on. Git will then create a new commit object with all of the same metadata (commit message, date, and so on) as the previous one, but with the tree as modified by your command, treating new files as adds and missing files as deletes (so your command does not need to do git add or git rm; it just needs to modify the tree).

For your purposes, something like the following should work, with the appropriate regular expression and file name depending on your exact situation:

git filter-branch --tree-filter "sed -i -e 's/SekrtPassWrd/REDACTED/' myscript.py" -- --all

Remember to do this to a copy of your repository, so that if something goes wrong, you still have the original and can start over. filter-branch also saves references to your original branches under refs/original/ (refs/original/refs/heads/master and so on), so you should be able to recover even if you forget to make a copy. Still, when making a global modification to my source code history, I like to have multiple fallbacks in case something goes wrong.

To explain how this works in more detail:

sed -i -e 's/SekrtPassWrd/REDACTED/' myscript.py

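This will replace SekrtPassWrd in your myscript.py file with REDACTED; the -i option to sed tells it to edit the file in place, with no backup file (as that backup would be picked up by Git as a new file). Note that this is the GNU sed form: BSD and macOS sed expect a backup suffix after -i, so there you would write -i '' instead.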

If you need to do something more complicated than a single substitution, you can write a script and invoke that as your command; just be sure to call it with an absolute pathname, as git filter-branch calls your command from within a temporary directory.
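
A minimal sketch of such a helper script, assuming the file name and password from the example above. The script name redact.sh and the function name redact_file are hypothetical. It skips commits in which the file does not exist yet (a failing command would make filter-branch abort), and it writes through a temporary file outside the tree instead of relying on sed -i, whose syntax differs between GNU and BSD sed. Unlike the one-liner above, it replaces every occurrence on a line, not just the first.

```shell
#!/bin/sh
# redact.sh (hypothetical name): replace a leaked secret in one file.

# redact_file FILE OLD NEW
# Replaces every occurrence of OLD with NEW in FILE.
# OLD is used as a basic regular expression, so in this simple sketch it
# must not contain '/' or regex metacharacters.
redact_file() {
    file=$1
    old=$2
    new=$3
    # Older commits may not contain the file yet; succeed without doing anything.
    [ -f "$file" ] || return 0
    # Temporary file outside the working tree, so Git never sees it.
    tmp=$(mktemp) || return 1
    # Writing back with cat keeps the original file's permissions.
    sed -e "s/$old/$new/g" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Values taken from the example in the answer.
redact_file myscript.py SekrtPassWrd REDACTED
```

Saved somewhere outside the repository and made executable, for example as /home/you/bin/redact.sh (an assumed location), it would be passed to --tree-filter by its absolute path in place of the sed command.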

git filter-branch --tree-filter <command> -- --all

This tells git to run a tree filter, as described above, over every branch in your repository. The -- --all part tells Git to apply this to all branches; without it, it would only edit the history of the current branch, leaving all of the other branches unchanged (which probably isn't what you want).
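
To see the whole procedure end to end without touching a real project, the following sketch builds a throwaway repository, rewrites it, and then checks the result. Everything in it (the temporary directory, the commit contents, the user name and e-mail) is made up for the demonstration, and it assumes GNU sed and a Git version that still ships filter-branch.

```shell
#!/bin/sh
# Throwaway demonstration; every path and name here is invented for the example.
set -e

demo=$(mktemp -d)
cd "$demo"
git init -q repo
cd repo
git config user.name "Demo User"
git config user.email "demo@example.com"

# Commit 1: the password is hardwired in the script.
printf 'PASSWORD = "SekrtPassWrd"\n' > myscript.py
git add myscript.py
git commit -q -m "Add FTP script"

# Commit 2: an unrelated change; the password is still in the file.
printf 'print("connecting")\n' >> myscript.py
git commit -q -a -m "Print a status message"

# Newer Git versions print a warning about filter-branch and pause; this
# environment variable suppresses that (older versions ignore it).
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch \
    --tree-filter "sed -i -e 's/SekrtPassWrd/REDACTED/' myscript.py" \
    -- --all > /dev/null

# Count commits on the rewritten branches whose diff adds or removes the password.
# --branches is used instead of --all, because --all would also include the
# backup refs under refs/original/, which still point at the old history.
leaks=$(git log --branches -S'SekrtPassWrd' --oneline | wc -l)
echo "commits still touching the password: $leaks"

# The pre-rewrite history is still reachable under refs/original/.
backups=$(git for-each-ref refs/original/ | wc -l)
echo "backup refs: $backups"
```

If the rewrite worked, the first count should be zero. The backup refs are a reminder that the old commits, password included, remain in the local repository until those refs are deleted and the unreferenced objects are pruned.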

See the documentation on GitHub on Removing Sensitive Data (as originally pointed out by MBO) for some more information about dealing with the copies of the information that have been pushed to GitHub. Note that they reiterate my advice to change your password, and provide some tips for dealing with cached copies that GitHub may still have.
