Is "blob" a synonym for the contents of a file I put in a git repository?
A knowledgeable person who answered one of my more complicated git questions seemed to read "blob" the way this question's title does. I disagree with that reading, and if it is a misunderstanding, it is one that can afflict even expert git users.
I'll be offering my own understanding as an answer - which I'm happy for you to improve or approve (or both!) - but feel free to add well-supported answers that differ significantly from my interpretation.
No. Although conflicting usages of the term are easy to find, in some important ways "blob" is not a synonym for the contents of a file that has been placed in a git repository.
For one, a blob object contains (sequentially):
- the word "blob"
- one space
- the number of bytes in your file, written as a decimal ASCII string and terminated by a null byte
- the actual (verbatim) data from your file
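As a minimal sketch of that layout, assuming a POSIX shell with `wc`, `cat` and `od` available, the following builds the same byte sequence for an arbitrary file and dumps it so the header becomes visible. `blob_bytes` and `demo-file` are hypothetical names used only for illustration; neither is part of git.

```shell
# blob_bytes FILE: write the blob header followed by FILE's bytes to stdout.
# Hypothetical helper for illustration only; not a git command.
blob_bytes() {
    # "wc -c < FILE" prints only the byte count. Some systems pad it with
    # spaces, which the unquoted expansion discards.
    printf 'blob %d\000' $(wc -c < "$1")
    # The file's data follows the header verbatim.
    cat "$1"
}

printf 'a' > demo-file
blob_bytes demo-file | od -c
```

For the one-byte file above, the dump should show eight bytes: the four letters of "blob", a space, the digit 1, a null byte, and the letter a.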
If your file data were the blob, then this definition of a blob's contents would be recursive.
Secondly, git stores the hash of the blob, not the hash of your file. From the git(1) man page:
"All objects are named by the SHA-1 hash of their contents, normally written as a string of 40 hex digits."
If the blob object and your file were the same thing, they would have the same hash. They do not:
$ printf "a" > file
$ openssl sha1 file
SHA1(file)= 86f7e437faa5a7fce15d1ddcb9eaeaea377667b8
$ git hash-object file
2e65efe2a145dda7ee51d1741299f848e5bf752e
$ { printf 'blob %d\000' $(wc -c < file); cat file; } > file-blob
$ openssl sha1 file-blob
SHA1(file-blob)= 2e65efe2a145dda7ee51d1741299f848e5bf752e
As you can see, the SHA1 of file-blob, constructed according to the above definition of a blob's contents, matches the hash that git stores to represent file, which we obtained from git-hash-object(1).
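The same check can be wrapped in a small function. This is only a sketch under a few assumptions: `sha1sum` from GNU coreutils is installed (on macOS, `shasum -a 1` would be the likely substitute), the repository uses git's SHA-1 object format, and `blob_id` and `demo-file` are hypothetical names, not git commands.

```shell
# blob_id FILE: print the SHA-1 that names FILE's blob, computed without git.
# Hypothetical helper name; assumes sha1sum (GNU coreutils) is available.
blob_id() {
    {
        printf 'blob %d\000' $(wc -c < "$1")   # header: type, space, size, null byte
        cat "$1"                                # verbatim file bytes
    } | sha1sum | awk '{print $1}'
}

printf 'a' > demo-file
blob_id demo-file

# If git is installed, its own answer can be printed for comparison.
if command -v git >/dev/null 2>&1; then
    git hash-object demo-file
fi
```

For a file containing the single byte a, both commands should print the ID shown in the transcript above, 2e65efe2a145dda7ee51d1741299f848e5bf752e. Streaming the file with `cat`, rather than embedding it through command substitution, keeps trailing newlines and `%` characters intact.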
A third, maybe more pedantic, reason is that treating "blob" as a synonym can mislead those who know the term from other contexts, such as databases, where a "BLOB" (Binary Large OBject) might be a verbatim representation of what you're storing.
To conclude: Although many people use "blob" as a sort of stand-in for the contents of a file in a repository, in git's parlance, it is not the same thing. Although the blob object represents your file and contains its data (after an ASCII header string), any resemblance is purely coincidental.