Redaction in git
Problem description
I started working on a little Python script for FTP recently. To start off with, I had server, login and password details for an FTP site hardwired in the script, but this didn't matter because I was only working on it locally.
I then had the genius idea of putting the project on GitHub. I realised my mistake soon after, and replaced the hardwired details with a solution involving .netrc. I've now removed the project from GitHub, as anyone could look at the history and see the login details in plain text.
The question is, is there any way to go through the git history and remove user name and password throughout, but otherwise leave the history intact? Or do I need to start a new repo with no history?
First of all, you should change the password on the FTP site. The password has already been made public; you can't guarantee that no one has cloned the repo, or it's not in plain-text in a backup somewhere, or something of the sort. If the password is at all valuable, I would consider it compromised by now.
Now, for your question about how to edit history. The git filter-branch command is intended for this purpose; it will walk through each commit in your repository's history, apply a command to modify it, and then create a new commit.
In particular, you want git filter-branch --tree-filter. This allows you to edit the contents of the tree (the actual files and directories) for each commit. It runs a command in a directory containing the entire tree; your command may edit files, add new files, delete files, move them, and so on. Git will then create a new commit object with all of the same metadata (commit message, date, and so on) as the previous one, but with the tree as modified by your command, treating new files as adds, missing files as deletes, and so on (so your command does not need to do git add or git rm, it just needs to modify the tree).
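To make the "just modify the tree" point concrete, here is a minimal sketch run against a throwaway repository. The file credentials.txt, the commit messages and the demo identity are hypothetical example data, and FILTER_BRANCH_SQUELCH_WARNING is only there to skip the deprecation pause that newer Git versions print.

```shell
#!/bin/sh
# Sketch: a tree filter only edits files; Git works out the adds and deletes.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer Git

repo=$(mktemp -d)                        # throwaway repository for the demo
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo 'print("hello")' > myscript.py
echo 'secret' > credentials.txt          # hypothetical file we want gone
git add .
git commit -q -m "First commit"

echo 'print("hello again")' >> myscript.py
git commit -q -am "Second commit"

# A plain rm, not git rm: the missing file is recorded as a deletion
# in every rewritten commit.
git filter-branch --tree-filter 'rm -f credentials.txt' -- --all

# Commit messages are preserved ...
git log --format=%s
# ... and no commit on any branch touches credentials.txt any more
# (this prints nothing).
git log --branches --oneline -- credentials.txt
```

Using rm -f rather than rm matters here: the filter runs once per commit, and a failing command aborts the whole rewrite.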
For your purposes, something like the following should work, with the appropriate regular expression and file name depending on your exact situation:
git filter-branch --tree-filter "sed -i -e 's/SekrtPassWrd/REDACTED/' myscript.py" -- --all
Remember to do this to a copy of your repository, so if something goes wrong, you will still have the original and can start over again. filter-branch will also save references to your original branches, as refs/original/refs/heads/master and so on, so you should be able to recover even if you forget to do this. When doing some global modification to my source code history, I like to make sure I have multiple fallbacks in case something goes wrong.
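The two fallbacks can be sketched as follows, again in a throwaway repository. This assumes GNU sed (for the argument-less -i) and uses a hypothetical one-line myscript.py; the final reset is only there to show how the saved ref brings the old history back.

```shell
#!/bin/sh
# Sketch: keep a full copy, then use the refs/original/ backup to undo a rewrite.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer Git

work=$(mktemp -d)
mkdir "$work/original"
cd "$work/original"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo 'PASSWORD = "SekrtPassWrd"' > myscript.py
git add myscript.py
git commit -q -m "Add FTP script"

# Fallback 1: a complete copy of the repository, made before rewriting.
cp -R "$work/original" "$work/backup"

git filter-branch --tree-filter "sed -i -e 's/SekrtPassWrd/REDACTED/' myscript.py" -- --all
rewritten=$(git show HEAD:myscript.py)   # content after the rewrite
echo "$rewritten"

# Fallback 2: filter-branch saved the pre-rewrite branch tips here.
git for-each-ref refs/original/

# Undo the rewrite of the current branch by resetting to its saved ref.
branch=$(git symbolic-ref HEAD)          # full ref name of the current branch
git reset -q --hard "refs/original/$branch"
```

Because refs/original/ still points at the unredacted commits, those refs (and the backup copy) must not be pushed anywhere public.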
To explain how this works in more detail:
sed -i -e 's/SekrtPassWrd/REDACTED/' myscript.py
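This will replace SekrtPassWrd in your myscript.py file with REDACTED; the -i option to sed tells it to edit the file in place, with no backup file (as that backup would be picked up by Git as a new file). This bare form of -i is GNU sed syntax; BSD and macOS sed expect an explicit suffix argument, as in -i ''.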
If you need to do something more complicated than a single substitution, you can write a script and invoke that as your command; just be sure to call it with an absolute pathname, as git filter-branch calls your command from within a temporary directory.
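As an illustration of the script approach, the sketch below writes a small helper and passes its absolute path to the tree filter. The script name redact.sh, the user name ftpuser and the README commit are all hypothetical; GNU sed is assumed. The helper is invoked through sh so that it does not depend on the temporary directory allowing executables.

```shell
#!/bin/sh
# Sketch: run a helper script as the tree filter, addressed by absolute path.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer Git

work=$(mktemp -d)

# Hypothetical helper: runs once per commit, inside that commit's checked-out tree.
cat > "$work/redact.sh" <<'EOF'
# Skip commits that do not contain the file, so sed does not fail on them.
[ -f myscript.py ] || exit 0
sed -i -e 's/SekrtPassWrd/REDACTED/' -e 's/ftpuser/REDACTED_USER/' myscript.py
EOF

mkdir "$work/repo"
cd "$work/repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo 'FTP helper' > README
git add README
git commit -q -m "Add README"            # a commit without myscript.py

printf 'USER = "ftpuser"\nPASSWORD = "SekrtPassWrd"\n' > myscript.py
git add myscript.py
git commit -q -m "Add FTP script"

# $work is absolute, so the filter is found no matter where it is run from.
git filter-branch --tree-filter "sh $work/redact.sh" -- --all

git show HEAD:myscript.py
```

The existence check in the helper is a design choice worth keeping: a filter that exits non-zero on any commit stops the rewrite, and early commits often predate the file being edited.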
git filter-branch --tree-filter <command> -- --all
This tells git to run a tree filter, as described above, over every branch in your repository. The -- --all part tells Git to apply this to all branches; without it, it would only edit the history of the current branch, leaving all of the other branches unchanged (which probably isn't what you want).
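One way to check that every branch really was rewritten is to search the branch histories for the old string afterwards. The sketch below does this in a throwaway repository with a hypothetical feature branch, assuming GNU sed. It also deletes the refs/original/ backups, which is an extra step beyond the command above, shown because those refs otherwise keep the old commits reachable.

```shell
#!/bin/sh
# Sketch: rewrite all branches, then confirm the password is gone from them.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer Git

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo 'PASSWORD = "SekrtPassWrd"' > myscript.py
git add myscript.py
git commit -q -m "Add FTP script"

git checkout -q -b feature               # hypothetical second branch
echo '# upload helper' >> myscript.py
git commit -q -am "Work on a feature branch"
git checkout -q -                        # back to the first branch

git filter-branch --tree-filter "sed -i -e 's/SekrtPassWrd/REDACTED/' myscript.py" -- --all

# Look for commits on any branch that add or remove the old string.
leaks=$(git log --branches --oneline -S SekrtPassWrd)
echo "branch commits still mentioning the password: ${leaks:-none}"

# The backup refs still reach the old commits; drop them once you are
# sure the rewrite is what you want.
git for-each-ref --format='delete %(refname)' refs/original/ | git update-ref --stdin
```

Deleting the refs only makes the old commits unreachable; the objects themselves stay in the local repository until Git garbage-collects them.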
See the documentation on GitHub on Removing Sensitive Data (as originally pointed out by MBO) for some more information about dealing with the copies of the information that have been pushed to GitHub. Note that they reiterate my advice to change your password, and provide some tips for dealing with cached copies that GitHub may still have.