Removing unreferenced objects from remote


Question

I'm wondering whether a remote git repo does (or should) automatically delete unreferenced file objects (and also trees) once it receives a push from local, after I rebase locally and skip some commits that introduced those files and later deleted them. Since the skipped commits are no longer in the history chain, it seems logical for the remote to delete these objects, as they are no longer part of any commit in the history. This graph may explain it:

This is the history before the rebase --onto:


 * b5b7c142 after-deleting offending-file
 * db759b06 deleted offending-file
 * 59a9440a added offending-file
 * 933729b1 before-adding-offending-file

This was pushed to the remote before I came to regret it. Here is my attempt to fix it:

git rebase --onto 933729b1 db759b06

This effectively rebuilds commit b5b7c142 (after-deleting offending-file) on a different parent, 933729b1 (before-adding-offending-file), and leaves the middle two commits out of the branch's history.

This is how it looks after the rebase above (note that the rebased commit's SHA-1 changed because its parent changed):


* 17c95f49 after-deleting offending-file
| * db759b06 deleted offending-file
| * 59a9440a added offending-file
| /
* 933729b1 before-adding-offending-file

The history looks fine locally, and the file object still exists in .git/objects because it is still part of commits that exist here. What happens if I push to the remote now? Will GitHub delete that file object from its .git/objects, since it is no longer part of any commit or tree on the branch? And if not, how can I make that happen?

Recommended answer

GitHub may or may not delete the unreachable commit and file at some point in the future. It's up to them.

A normal everyday Git repository (one you control, for instance) will generally drop the unreferenced commit entirely when git gc runs. For that to happen, though, all references to it have to go away first. Using git rebase leaves several references behind, on purpose:

  • There is an entry in the HEAD reflog (viewable with git reflog).
  • There is an entry in the branch reflog (viewable with git reflog branch, where branch is the branch's name).
  • There is a reference in ORIG_HEAD.
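The leftover references listed above can be seen in a throwaway repository. The following is a minimal sketch, not the asker's actual repository: the demo identity, the README file, and the temporary directory are assumptions made for illustration, and the commit hashes will differ from those in the question.

```shell
#!/bin/sh
set -eu

# Hypothetical identity so commits work in a scratch environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .

# Rebuild the four-commit history from the question.
echo base > README
git add README
git commit -q -m "before-adding-offending-file"
base=$(git rev-parse HEAD)

echo secret > offending-file
git add offending-file
git commit -q -m "added offending-file"

git rm -q offending-file
git commit -q -m "deleted offending-file"
deleted=$(git rev-parse HEAD)

echo more >> README
git commit -q -am "after-deleting offending-file"
old_tip=$(git rev-parse HEAD)

# Replay only the last commit onto the first one, skipping the middle two.
git rebase -q --onto "$base" "$deleted"
new_tip=$(git rev-parse HEAD)
branch=$(git symbolic-ref --short HEAD)

echo "old tip: $old_tip"
echo "new tip: $new_tip"
echo "--- HEAD reflog ---"
git --no-pager reflog
echo "--- $branch reflog ---"
git --no-pager reflog show "$branch"
echo "--- ORIG_HEAD ---"
git rev-parse ORIG_HEAD
```

The old tip still appears in both reflogs after the rebase, which is why the skipped commits and the file they introduced remain in the local object store.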

The last one will be overwritten by the next operation that saves the previous HEAD value in ORIG_HEAD. The other two will eventually be dropped due to reflog entry expiration. Each reflog entry is timestamped, and is "live" until the current time is more than the expiration time added to the entry's timestamp. Another of git gc's functions is to check for expired entries, which it will delete. The expiration time is under your control, and is both 30 days and 90 days by default. This part is confusing (how can it be both? The 90-day default, gc.reflogExpire, applies to entries still reachable from the branch's current tip, and the 30-day default, gc.reflogExpireUnreachable, to entries that are not), but it is not really relevant to the GitHub variant because they don't use the reflogs like this. The point is that the references have to be really gone, which takes time, and this part is true for GitHub as well.
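In a repository you control, you do not have to wait for the defaults. The sketch below, run in a scratch repository, expires every reflog entry immediately and then runs git gc with --prune=now. The demo identity, file names, and temporary directory are assumptions for illustration, and the two git reflog expire / git gc commands discard recovery information, so they should not be run casually in a repository you care about.

```shell
#!/bin/sh
set -eu

# Hypothetical identity so commits work in a scratch environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .

echo base > README
git add README
git commit -q -m "before-adding-offending-file"
base=$(git rev-parse HEAD)

echo secret > offending-file
git add offending-file
git commit -q -m "added offending-file"
blob=$(git rev-parse HEAD:offending-file)   # object ID of the unwanted file

git rm -q offending-file
git commit -q -m "deleted offending-file"
deleted=$(git rev-parse HEAD)

echo more >> README
git commit -q -am "after-deleting offending-file"

git rebase -q --onto "$base" "$deleted"

# Show the configured expiry times; both keys are unset unless you set them.
git config --get gc.reflogExpire ||
    echo "gc.reflogExpire unset (default: 90 days)"
git config --get gc.reflogExpireUnreachable ||
    echo "gc.reflogExpireUnreachable unset (default: 30 days)"

if git cat-file -e "$blob" 2>/dev/null; then present_before=yes; else present_before=no; fi
echo "blob present before cleanup: $present_before"

# Drop the remaining references, then collect garbage right away.
git update-ref -d ORIG_HEAD
git reflog expire --expire=now --all
git gc -q --prune=now

if git cat-file -e "$blob" 2>/dev/null; then present_after=yes; else present_after=no; fi
echo "blob present after cleanup: $present_after"
```

This only affects the local repository. Nothing here reaches into the copy stored on the remote.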

Once the references are really gone, a git gc would discard the internal objects that hold the unwanted commit and file, provided that they're not in a kept pack. Kept packs are something you have to create on your own (Git doesn't do this itself), so if you're not doing that, you personally won't encounter this.

The main issue you'll have with GitHub is that you don't know when they will scrub their last reference, nor when they will subsequently run a git gc that will discard the object. On top of that, they add special refs for pull requests, issues, and other items, which can keep objects alive indefinitely. The upshot of all of this is that you cannot predict when, or even whether, some file will disappear from GitHub.
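A local bare repository can stand in for a remote to show that a force-push by itself does not delete anything on the receiving side. This is only a sketch under stated assumptions: the bare repository, the demo identity, and the file names are illustrative, and GitHub's own storage and cleanup may behave differently from plain Git.

```shell
#!/bin/sh
set -eu

# Hypothetical identity so commits work in a scratch environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
remote="$work/remote.git"          # bare repo standing in for the remote
git init -q --bare "$remote"

git init -q "$work/local"
cd "$work/local"
git remote add origin "$remote"

echo base > README
git add README
git commit -q -m "before-adding-offending-file"
base=$(git rev-parse HEAD)

echo secret > offending-file
git add offending-file
git commit -q -m "added offending-file"
blob=$(git rev-parse HEAD:offending-file)

git rm -q offending-file
git commit -q -m "deleted offending-file"
deleted=$(git rev-parse HEAD)

echo more >> README
git commit -q -am "after-deleting offending-file"

# First push: the remote now holds the offending blob.
git push -q origin HEAD

# Rewrite history locally, then overwrite the remote branch.
git rebase -q --onto "$base" "$deleted"
git push -q --force origin HEAD

# Is the blob still stored on the remote? Is it reachable from any ref there?
if git --git-dir="$remote" cat-file -e "$blob" 2>/dev/null; then stored=yes; else stored=no; fi
if git --git-dir="$remote" rev-list --objects --all | grep -q "$blob"; then reachable=yes; else reachable=no; fi

echo "blob stored on remote:    $stored"
echo "blob reachable on remote: $reachable"
```

After the force-push the blob is unreachable from the remote's refs but still sits in its object store until a later garbage collection on that side removes it.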

Note that you can contact GitHub support and get them to do a manual scrub. Of course, by then, any number of people could have obtained this file, so if there's any sensitive data in it, consider it to be well-known to the black-hat hacker community by now.
