Can you get a duplicate hash in Git in any way, and what are the implications?


Question

My view is that it should be possible to get a duplicate Git hash: a hash is a condensed representation of the content, so distinct inputs must sometimes map to the same value. More importantly, there should be some sequence of steps in which different changes are committed yet produce the same hash.

For example, clone the same repository twice on the same machine, make almost exactly the same change in each clone (differing by a single byte or bit), and commit. Even if the directory name or timestamp feeds into the commit hash, a duplicate should still be possible, though admittedly rare. Another case: two people on two different machines committing at the same moment.

My question is twofold: how can this happen, and how will Git handle it?

More explicitly, how does Git ensure you are up to date before a push? Is it possible that one person pushes first, then the other tries to push (both changes based on the same parent commit), and Git sees that the hashes in the remote and local history match, decides you are good to go, and accepts the push, silently losing one of the changes? I picture the situation like this:

repo1 a->b->c1

repo2 a->b->c1'->c2

Say c1, c1', and c2 are all created after both repos were cloned at b. repo1 pushes without problems. Then repo2 attempts to push c1' and c2, and Git decides that c1' = c1 even though they actually differ. Git puts c2 on top of c1 to get a->b->c1->c2, and the change made in c1' is lost.

Is this possible? How could it happen, and what would Git do?

Recommended answer

With regard to the part of your question that relates to duplicate hashes:

Git relies completely on the uniqueness of the hashes it generates and, as far as I know, has no safeguards for handling different data blobs that yield the same hash value. However, the chance of an accidental hash collision is vanishingly small and can be ignored in practice. If you are still worried, this passage from Pro Git may put your mind at rest:

A higher probability exists that every member of your programming team will be attacked and killed by wolves in unrelated incidents on the same night.
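To put a rough number on "vanishingly small", here is a minimal Python sketch of the standard birthday-bound approximation. It assumes SHA-1 outputs behave like uniformly random 160-bit values, so it covers accidental collisions only, not deliberately crafted ones. The name `collision_probability` is illustrative and not part of Git.

```python
import math

HASH_BITS = 160  # SHA-1 produces a 160-bit digest


def collision_probability(num_objects: int, bits: int = HASH_BITS) -> float:
    """Estimate the chance that at least two of num_objects hashes collide.

    Uses the birthday-bound approximation
        p ~= 1 - exp(-n * (n - 1) / 2**(bits + 1)),
    which assumes every hash is an independent, uniformly random value.
    """
    exponent = num_objects * (num_objects - 1) / 2 ** (bits + 1)
    # expm1 keeps precision when the exponent is extremely close to zero.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    for n in (10**6, 10**9, 10**12):
        print(f"{n:>15,d} objects -> p ~= {collision_probability(n):.3e}")
```

For instance, `collision_probability(10**9)` estimates the chance of at least one accidental collision among a billion objects. Under this model the probability only approaches one half once the object count is on the order of 2**80.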

As for the second part of your question (what happens):

If you do happen to commit an object that hashes to the same SHA-1 value as a previous object in your repository, Git will see the previous object already in your Git database and assume it was already written. If you try to check out that object again at some point, you’ll always get the data of the first object.
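The behaviour described in that passage can be sketched in Python. `git_blob_id` follows Git's blob hashing (SHA-1 over a `blob <size>\0` header followed by the content). `ToyObjectStore` and `length_only_id` are hypothetical names for illustration, and the deliberately weak `length_only_id` exists only to force a collision, since a real SHA-1 collision is impractical to produce for a demo. This is a simplified model, not Git's actual storage code.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def length_only_id(content: bytes) -> str:
    """Deliberately weak stand-in hash (assumption): equal lengths collide."""
    return "%040x" % len(content)


class ToyObjectStore:
    """Simplified content-addressed store: an existing ID is never rewritten."""

    def __init__(self, id_func=git_blob_id):
        self._id_func = id_func
        self._objects = {}

    def write(self, content: bytes) -> str:
        oid = self._id_func(content)
        # If the ID is already present, assume the object was written before.
        if oid not in self._objects:
            self._objects[oid] = content
        return oid

    def read(self, oid: str) -> bytes:
        return self._objects[oid]


if __name__ == "__main__":
    # With real SHA-1, a one-byte difference gives a different ID.
    print(git_blob_id(b"change A"), git_blob_id(b"change B"))

    # With the weak hash, the second write is silently ignored.
    store = ToyObjectStore(id_func=length_only_id)
    first = store.write(b"change A")
    second = store.write(b"change B")
    print(first == second, store.read(second))
```

In the weak-hash run both writes return the same ID, and reading it back yields the first content, which mirrors the "you'll always get the data of the first object" outcome in the quote.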
