Creating a GitHub repository with only a subset of a local repository's history

Question

The background: I'm moving closer to open-sourcing a personal research code that I've been working on for more than two years. It started life as an SVN repository, but I moved to Git about a year ago, and I'd like to share the code on GitHub. It has accumulated a lot of cruft over the years, though, and I'd prefer that the public version begin its life at its current state. I'd still like to keep contributing to it and to incorporate other people's potential contributions.

The question: is there a way to "fork" a Git repository such that no history is retained on the fork (which lives on GitHub), but that my local repository still has a complete history, and I can pull/push to GitHub?

I don't have any experience in the administrating end of large repositories, so detail is very much appreciated.

Recommended answer

You can create a new, fresh history quite easily in Git. Let’s say you want your master branch to be the one that you will push to GitHub, and your full history to be stored in old-master. You can just move your master branch to old-master, and then start a fresh new branch with no history using git checkout --orphan:

git branch -m master old-master
git checkout --orphan master
git commit -m "Import clean version of my code"
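
To see what these three commands do without touching a real project, the sketch below replays them in a throwaway repository. The temporary directory, the file name `code.txt`, the demo identity, and the three placeholder "old" commits are all assumptions made for illustration; only the three commands in the middle come from the answer.

```shell
set -e
# Throwaway repository with three placeholder "old" commits (illustrative only).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name used in the answer
git config user.name "Demo"
git config user.email "demo@example.com"
for i in 1 2 3; do
  echo "revision $i" > code.txt
  git add code.txt
  git commit -q -m "Old commit $i"
done

# The commands from the answer: keep the full history under old-master and
# start master again from a single parentless commit.
git branch -m master old-master
git checkout -q --orphan master           # index and working tree are kept as-is
git commit -q -m "Import clean version of my code"

new_count=$(git rev-list --count master)
old_count=$(git rev-list --count old-master)
new_tree=$(git rev-parse 'master^{tree}')
old_tree=$(git rev-parse 'old-master^{tree}')
echo "master: $new_count commit(s), old-master: $old_count commit(s)"
```

The two branches end up with the same tree (the same files) but unrelated commit histories, which is what makes the later push safe.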

Now you have a new master branch with no history, which you can push to GitHub. But, as you say, you would like to be able to see all of the old history in your local repository; and would probably like for it to not be disconnected.

You can do this using git replace. A replacement ref is a way of specifying an alternate commit any time Git looks at a given commit. So you can tell Git to look at the last commit of your old branch, instead of the first commit of your new branch, when looking at history. In order to do this, you need to bring in the disconnected history from the old repository.

git replace master old-master
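
One way to check the effect of the replacement is to compare what Git walks with and without replace refs. The sketch below is a hedged illustration in a throwaway repository (the temporary directory, `code.txt`, and the placeholder commits are assumptions); it relies on the standard `--no-replace-objects` option of `git` to show the real, unreplaced history.

```shell
set -e
# Throwaway repository: three placeholder old commits, then the orphan master.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
for i in 1 2 3; do
  echo "revision $i" > code.txt
  git add code.txt
  git commit -q -m "Old commit $i"
done
git branch -m master old-master
git checkout -q --orphan master
git commit -q -m "Import clean version of my code"

# Whenever Git reads the new root commit, hand it the tip of old-master instead.
git replace master old-master

# History as seen locally, through the replace ref.
with_replace=$(git rev-list --count master)
# History as actually stored in the commit objects.
without_replace=$(git --no-replace-objects rev-list --count master)
echo "with replace refs: $with_replace, without: $without_replace"

# The replacement itself is stored as an ordinary ref under refs/replace/.
replace_refs=$(git replace -l | wc -l)
```

The count through the replace ref covers the old commits, while the count with `--no-replace-objects` covers only the import commit, since the commit object itself still has no parent.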

Now you have your new branch, in which you can see all of your history, but the actual commit objects are disconnected from the old history, and so you can push the new commits to GitHub without the old commits coming along. Push your master branch to GitHub, and only the new commits will go to GitHub. But take a look at the history in gitk or git log, and you'll see the full history.

git push github master:master
gitk --all
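
The claim that only the new commits travel can be tried out locally before pushing to the real GitHub remote. In this sketch a bare repository in a temporary directory stands in for GitHub; that stand-in, the remote path, the file name, and the placeholder commits are assumptions for illustration.

```shell
set -e
work=$(mktemp -d)

# Local repository: old history, orphan master, and the replace ref.
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
for i in 1 2 3; do
  echo "revision $i" > code.txt
  git add code.txt
  git commit -q -m "Old commit $i"
done
git branch -m master old-master
git checkout -q --orphan master
git commit -q -m "Import clean version of my code"
git replace master old-master

# A bare repository standing in for GitHub (hypothetical local path).
git init -q --bare "$work/github.git"
git remote add github "$work/github.git"

# Object transfer ignores replace refs, so only the real ancestry of master is sent.
git push -q github master:master

pushed=$(git --git-dir="$work/github.git" rev-list --count master)
local_view=$(git rev-list --count master)
echo "commits on the remote: $pushed, commits visible locally: $local_view"
```

Running the same check against a scratch remote first is a cheap way to confirm nothing old leaks before the first public push.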

Gotchas

If you ever base any new branches on the old commits, you will have to be careful to keep the history separate; otherwise, new commits on those branches will really have the old commits in their history, and so you'll pull the whole history along if you push it up to GitHub. As long as you keep all of your new commits based on your new master, though, you'll be fine.

If you ever run git push --tags github, that will push all of your tags, including old ones, which will cause all of your old history to be pulled along with it. You could deal with this by deleting all of your old tags (git tag -d $(git tag -l)), or by never using git push --tags but only ever pushing tags manually, or by using two repositories as described below.
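
As a rough sketch of the "push tags manually" option, the script below picks out only the tags that are reachable from the real history of master and pushes those one by one. It is an illustration under assumptions, not part of the original answer: the tag names `v0.1` and `v1.0`, the bare repository standing in for GitHub, and the use of `git tag --merged` (available in reasonably recent Git versions) are all choices made here.

```shell
set -e
work=$(mktemp -d)

git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
for i in 1 2 3; do
  echo "revision $i" > code.txt
  git add code.txt
  git commit -q -m "Old commit $i"
done
git tag v0.1                      # hypothetical old tag, points into the old history
git branch -m master old-master
git checkout -q --orphan master
git commit -q -m "Import clean version of my code"
git tag v1.0                      # hypothetical new tag on the clean master
git replace master old-master

git init -q --bare "$work/github.git"   # stand-in for GitHub
git remote add github "$work/github.git"
git push -q github master:master

# Ignore replace refs so that only tags on the real new history are selected.
safe_tags=$(git --no-replace-objects tag --merged master)
for t in $safe_tags; do
  git push -q github "refs/tags/$t"
done

pushed_tags=$(git --git-dir="$work/github.git" tag -l)
pushed_commits=$(git --git-dir="$work/github.git" rev-list --count --all)
echo "tags on the remote: $pushed_tags"
```

Without `--no-replace-objects`, the old tags would also appear to be merged into master, because the replace ref makes the old commits look like ancestors.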

The basic problem underlying both of these gotchas is that if you ever push any ref which connects to any of the old history (other than via the replaced commits), you will push up all of the old history. Probably the best way of avoiding this is by using two repositories, one which contains only the new commits, and one which contains both the old and new history, for the purpose of inspecting the full history. You do all of your work, your committing, your pushing and pulling from GitHub, in the repository with just the new commits; that way, you can't possibly accidentally push your old commits up.

You then pull all of your new commits into your repository that has the full history, whenever you need to look at the entire thing. You can either pull from GitHub or your other local repository, whichever is more convenient. It will be your archive, but to avoid accidentally publishing your old history, you don't ever push to GitHub from it. Here's how you can set it up:


~$ mkdir newrepo
~$ cd newrepo
newrepo$ git init
newrepo$ git pull ~/oldrepo master
# Now newrepo has just the new history; we can set up oldrepo to pull from it
newrepo$ cd ~/oldrepo
oldrepo$ git remote add newrepo ~/newrepo
oldrepo$ git remote update
# Git 1.8.0 and later; older versions used: git branch --set-upstream master newrepo/master
oldrepo$ git branch --set-upstream-to=newrepo/master master
# ... do work in newrepo, commit, push to GitHub, etc.
# Now if we want to look at the full history in oldrepo:
oldrepo$ git pull
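
The two-repository workflow can be rehearsed end to end in temporary directories. The following sketch assumes hypothetical paths under a `mktemp` directory in place of `~/oldrepo` and `~/newrepo`, placeholder commits, and the `--set-upstream-to` spelling accepted by Git 1.8.0 and later; it only checks how many commits each repository sees.

```shell
set -e
work=$(mktemp -d)

# oldrepo: full history, the orphan master, and the replace ref.
git init -q "$work/oldrepo"
cd "$work/oldrepo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
for i in 1 2 3; do
  echo "revision $i" > code.txt
  git add code.txt
  git commit -q -m "Old commit $i"
done
git branch -m master old-master
git checkout -q --orphan master
git commit -q -m "Import clean version of my code"
git replace master old-master

# newrepo: receives only master, so it never holds the old commits.
git init -q "$work/newrepo"
cd "$work/newrepo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
git pull -q "$work/oldrepo" master
echo "more work" >> code.txt
git commit -q -am "First commit after the import"
new_count=$(git rev-list --count master)

# oldrepo: track newrepo and pull the new work into the archive.
cd "$work/oldrepo"
git remote add newrepo "$work/newrepo"
git fetch -q newrepo
git branch --set-upstream-to=newrepo/master master >/dev/null
git pull -q --ff-only
full_count=$(git rev-list --count master)
echo "newrepo sees $new_count commits, oldrepo sees $full_count"
```

Because newrepo never contains the old objects, no push from it can publish them, whichever refs are pushed.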

If you are using a version of Git older than 1.7.2

You don't have git checkout --orphan, so you'll have to do it manually by creating a fresh repository from the current revision of your existing repository, and then pulling in your old disconnected history. You can do this with, for example:


oldrepo$ mkdir ~/newrepo
# Export the tracked files at HEAD, preserving the directory layout
oldrepo$ git archive HEAD | tar -x -C ~/newrepo
oldrepo$ cd ~/newrepo
newrepo$ git init
newrepo$ git add .
newrepo$ git commit -m "Import clean version of my code"
newrepo$ git fetch ~/oldrepo master:old-master

If you are using a version of Git older than 1.6.5

git replace and replace refs were added in 1.6.5, so you'll have to use an older, somewhat less flexible mechanism known as grafts, which allow you to specify alternate parents for a given commit. Instead of the git replace command, run:

echo $(git rev-parse master) $(git rev-parse old-master) >> .git/info/grafts

This will make it look, locally, as if the master commit has the old-master commit as its parent, so you will see one more commit than you would with git replace.
