What is a bare repository and why would I need one?


Question

This may already have been answered, but I couldn't find a good answer.
I come from centralized repositories, such as SVN, where you usually only perform checkouts, updates, commits, reverts, merges and not much more.

Git is driving me crazy. There are tons of commands, but the most difficult to understand is why many things work as they do.

According to "What is a bare git repository?":

Repositories created with git init --bare are called bare repos. They are structured a bit differently from working directories. First off, they contain no working or checked out copy of your source files.

A bare repository created with git init --bare is for… sharing. …developers will clone the shared bare repo, make changes locally in their working copies of the repo, then push back to the shared bare repo to make their changes available to other users.
– Jon Saints, http://www.saintsjd.com/2011/01/what-is-a-bare-git-repository/

However, from the accepted answer to "what's the difference between github repository and git bare repository?":

Git repos on GitHub are bare, like any remote repo to which you want to push to [sic].
– VonC, https://stackoverflow.com/a/20855207

However, on GitHub there are source files. I can see them. If I create a bare repository, there are no source files, only the contents of the .git directory of a working repository.

How is this possible? What don't I understand?

Can you give an example of why I would need a bare repository, and the motivation for it working that way?

UPDATE

Edward Thomson's answer is, in part, what I wanted to know. Nevertheless, I will rephrase my question:

The first link I posted ("What is a bare git repository?") states:

they [bare repositories] contain no working or checked out copy of your source files.

VonC's answer:

Git repos on GitHub are bare

Both statements imply

GitHub has no working copy.

Edward Thomson says:

it renders the web page based on the data as you navigate through it - pulling the data directly out of the repo and out to your web browser, not writing it to a disk on the fileserver first

Somehow, a bare repository has to contain all data and source code. If not, it would be impossible to render anything, because I can see all the committed source code, all branches (with their respective source), the whole log of a repo, etc.

Is the whole data of a repository always stored within the .git directory (or in a bare repo), in some kind of format that makes it possible to render all files at any time? Is this the reason for bare repositories, while a working copy only has the files at a given time?

Answer

Is the whole data of a repository always stored within the .git directory (or in a bare repo), in some kind of format that makes it possible to render all files at any time?

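Yes. The file contents and their complete history are stored as objects in .git/objects, and the branch and tag references that point into that history are stored in .git/refs and .git/packed-refs.

When you clone a repo (bare or not), you always have the .git folder (or, for a bare repo, a folder with a .git extension, by naming convention) with its Git administrative and control files. (see glossary)

Git can read whatever object it needs at any time, whether it is stored loose or inside a packfile. (git unpack-objects can expand a pack into loose objects, but that is not required in order to read them.)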
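To make the storage claim concrete, here is a minimal Python sketch of how a single file's content ends up as a "loose" object under .git/objects. It assumes the default SHA-1 object format and ignores packfiles entirely; the function names `write_loose_blob` and `read_loose_blob` are made up for this illustration and are not part of Git.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def write_loose_blob(objects_dir: Path, content: bytes) -> str:
    """Store content as a loose blob object and return its object ID."""
    # A blob is the header b"blob <size>\0" followed by the raw file content.
    store = b"blob " + str(len(content)).encode() + b"\0" + content
    # Assumption: SHA-1 object format (the default); SHA-256 repos differ.
    object_id = hashlib.sha1(store).hexdigest()
    # The first two hex digits name a subdirectory, the other 38 the file.
    path = objects_dir / object_id[:2] / object_id[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(store))
    return object_id


def read_loose_blob(objects_dir: Path, object_id: str) -> bytes:
    """Read a loose blob back, with no working tree involved."""
    path = objects_dir / object_id[:2] / object_id[2:]
    store = zlib.decompress(path.read_bytes())
    header, _, content = store.partition(b"\0")
    kind, size = header.split(b" ")
    if kind != b"blob" or int(size) != len(content):
        raise ValueError("not a well-formed blob object")
    return content


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        objects = Path(tmp) / "objects"
        oid = write_loose_blob(objects, b"hello from the object store\n")
        print(oid, read_loose_blob(objects, oid))
```

The point of the sketch is only that the content is recoverable from the object database alone, which is exactly what a bare repository keeps.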

The trick is:

From a bare repo, you can query the logs (git log in a git bare repo works just fine: no need for a working tree), or list files in a bare repo.
Or show the content of a file from a bare repo.
That is how GitHub can render a page with files without having to check out the full repo.
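The three queries above can be tried end to end with a small Python sketch that drives the git command line. It assumes git is installed and on the PATH; the directory names (`work`, `shared.git`), the file `hello.txt` and the helper names are invented for the example.

```python
import subprocess
import tempfile
from pathlib import Path

# Throwaway identity so the commit works on a machine with no git config.
IDENTITY = ["-c", "user.name=Example", "-c", "user.email=example@example.com",
            "-c", "commit.gpgsign=false"]


def git(*args: str, cwd: Path) -> str:
    """Run a git command in cwd and return its standard output."""
    result = subprocess.run(["git", *args], cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout


def inspect_bare_repo(root: Path) -> dict:
    """Push one commit into a bare repo, then query the bare repo directly."""
    work = root / "work"
    bare = root / "shared.git"
    work.mkdir()
    bare.mkdir()
    git("init", "--bare", cwd=bare)
    git("init", cwd=work)
    (work / "hello.txt").write_text("hello from the working tree\n")
    git("add", "hello.txt", cwd=work)
    git(*IDENTITY, "commit", "-m", "Add hello.txt", cwd=work)
    # Use whatever the default branch is called (master, main, ...).
    branch = git("symbolic-ref", "--short", "HEAD", cwd=work).strip()
    git("push", str(bare), branch, cwd=work)
    return {
        # All three commands run inside the bare repo: no working tree needed.
        "log": git("log", "--format=%s", branch, cwd=bare),
        "files": git("ls-tree", "-r", "--name-only", branch, cwd=bare),
        "content": git("show", f"{branch}:hello.txt", cwd=bare),
        "entries": sorted(p.name for p in bare.iterdir()),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for key, value in inspect_bare_repo(Path(tmp)).items():
            print(key, "->", value)
```

Note that `hello.txt` never appears among the bare repo's directory entries, yet `git show` can still print its content.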

I don't know whether GitHub does exactly that, though, as the sheer number of repos forces the GitHub engineering team to do all kinds of optimization.
See for instance how they optimized cloning/fetching a repo.
With DGit, those bare repos are actually replicated across multiple servers.

Is this the reason for bare repositories, while a working copy only has the files at a given time?

For GitHub, maintaining a working tree would cost too much in disk space and in updates (each user may request a different branch). It is better to extract from the single bare repo whatever is needed to render a page.

In general (outside of GitHub's constraints), a bare repo is used for pushing, in order to avoid having a working tree out of sync with what has just been pushed. See "but why do I need a bare repo?" for a concrete example.
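The out-of-sync problem is why Git, with its default configuration (`receive.denyCurrentBranch` left at its default of refusing), rejects a push to the branch that is checked out in a non-bare repo, while the same push to a bare repo goes through. The following Python sketch illustrates this; it assumes git is on the PATH, and the directory and helper names are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path

# Throwaway identity so commits work on a machine with no git config.
IDENTITY = ["-c", "user.name=Example", "-c", "user.email=example@example.com",
            "-c", "commit.gpgsign=false"]


def run_git(*args: str, cwd: Path, check: bool = True):
    """Run a git command in cwd and return the CompletedProcess."""
    return subprocess.run(["git", *args], cwd=cwd, check=check,
                          capture_output=True, text=True)


def push_exit_codes(root: Path) -> dict:
    """Push the same new commit to a non-bare and to a bare destination."""
    work = root / "work"
    work.mkdir()
    run_git("init", cwd=work)
    (work / "a.txt").write_text("one\n")
    run_git("add", "a.txt", cwd=work)
    run_git(*IDENTITY, "commit", "-m", "First commit", cwd=work)
    branch = run_git("symbolic-ref", "--short", "HEAD",
                     cwd=work).stdout.strip()

    # Two destinations, both starting from the first commit.
    run_git("clone", str(work), str(root / "nonbare"), cwd=root)
    run_git("clone", "--bare", str(work), str(root / "bare.git"), cwd=root)

    # A second commit that exists only in the original working repo.
    (work / "a.txt").write_text("two\n")
    run_git(*IDENTITY, "commit", "-am", "Second commit", cwd=work)

    codes = {}
    for name in ("nonbare", "bare.git"):
        # check=False: a rejected push is the expected outcome for "nonbare".
        result = run_git("push", str(root / name), branch,
                         cwd=work, check=False)
        codes[name] = result.returncode
    return codes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(push_exit_codes(Path(tmp)))
```

A non-zero exit code for `nonbare` and zero for `bare.git` is what the default settings produce; a user-level Git configuration that changes `receive.denyCurrentBranch` would alter the first result.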

That being said:

But that would not be possible for GitHub, which cannot maintain one (or several) working tree(s) for each repo it has to store.


The article "Using a bare Git repo to get version control for my dotfiles" by Greg Owen, originally reported by aifusenno1, adds:

A bare repository is a Git repository that does not have a snapshot.
It just stores the history. It also happens to store the history in a slightly different way (directly at the project root), but that’s not nearly as important.

A bare repository will still store your files (remember, the history has enough data to reconstruct the state of your files at any commit).
You can even create a non-bare repository from a bare repository: if you git clone a bare repository, Git will automatically create a snapshot for you in the new repository (if you want a bare repository, use git clone --bare).
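As a rough illustration of that last point, this Python sketch clones the same bare repo twice, once normally and once with `--bare`, and reports which clone ends up with a checked-out file. It assumes git is on the PATH; the names `seed`, `origin.git`, `checkout` and `copy.git` are made up for the example.

```python
import subprocess
import tempfile
from pathlib import Path

# Throwaway identity so the commit works on a machine with no git config.
IDENTITY = ["-c", "user.name=Example", "-c", "user.email=example@example.com",
            "-c", "commit.gpgsign=false"]


def git(*args: str, cwd: Path) -> str:
    """Run a git command in cwd and return its standard output."""
    result = subprocess.run(["git", *args], cwd=cwd, check=True,
                            capture_output=True, text=True)
    return result.stdout


def clone_from_bare(root: Path) -> dict:
    """Clone one bare repo as a normal repo and again as a bare repo."""
    seed = root / "seed"
    seed.mkdir()
    git("init", cwd=seed)
    (seed / "hello.txt").write_text("hello\n")
    git("add", "hello.txt", cwd=seed)
    git(*IDENTITY, "commit", "-m", "Initial commit", cwd=seed)

    # The shared bare repo that both clones start from.
    git("clone", "--bare", str(seed), str(root / "origin.git"), cwd=root)
    # A plain clone gets a .git directory plus a checked-out snapshot.
    git("clone", str(root / "origin.git"), str(root / "checkout"), cwd=root)
    # A --bare clone gets only the history, stored at the top level.
    git("clone", "--bare", str(root / "origin.git"),
        str(root / "copy.git"), cwd=root)

    return {
        "checkout_has_file": (root / "checkout" / "hello.txt").is_file(),
        "checkout_has_dot_git": (root / "checkout" / ".git").is_dir(),
        "bare_copy_has_file": (root / "copy.git" / "hello.txt").exists(),
        "bare_copy_has_objects": (root / "copy.git" / "objects").is_dir(),
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(clone_from_bare(Path(tmp)))
```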

And Greg adds:

So why would we use a bare Git repository?

Almost every explanation I found of bare repositories mentioned that they’re used for centralized storage of a repository that you want to share between multiple users.

See Git repository layout:

a <project>.git directory that is a bare repository (i.e. without its own working tree), that is typically used for exchanging histories with others by pushing into it and fetching from it.

Basically, if you wanted to write your own GitHub/GitLab/BitBucket, your centralized service would store each repo as a bare repository.
But why? How does not having a snapshot connect to sharing?

The answer is that there’s no need to have a snapshot if the only service that’s interacting with your repo is Git.
Basically, the snapshot is a convenience for humans and non-Git tools, but Git only interacts with the history. Your centralized Git hosting service will only interact with the repos through Git commands, so why bother materializing snapshots all the time? The snapshots only take up extra space for no gain.

GitHub generates that snapshot on the fly when you access that page, rather than storing it permanently with the repo (this means that GitHub only needs to generate a snapshot when you ask for it, rather than keeping one updated every time anybody pushes any changes).
