为什么Kademlia如何构造其路由表? [英] Why does Kademlia structure its routing table how it does?

查看:105
本文介绍了为什么Kademlia如何构造其路由表?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我知道Kademlia路由表由160个存储桶组成.

I understand that the Kademlia routing table is made up of 160 buckets.

根据节点的前缀长度(即本地节点密钥和节点的XOR中前导未设置位的数量),将节点放入存储桶0-159中.

Nodes are put into buckets 0-159, depending on their prefix length (which is the number of leading unset bits in the XOR of the local node key and the node).

为什么会这样,是否有任何性能上的好处(除了无法遍历160 * 20个节点以找到最接近的节点这一事实之外)?

Why is this so, is there any performance benefits involved (other than the fact that iterating through 160*20 nodes to find the closest is infeasible)?.

推荐答案

Kademlia使用2个节点ID的XOR来衡量它们之间的距离.路由表存储桶的概念是,节点对与其接近"的网络有详细的了解,而对节点ID的了解越少.

Kademlia uses the XOR of 2 node IDs as a measure of their distance apart. The idea with the routing table buckets is that a node has detailed knowledge of the network "close" to it and less knowledge the further you get from its ID.

要查找最接近特定键的一组节点,节点首先要从自己的路由表中查询所知道的最接近的节点.平均而言,这些查找的一半将落入存储桶0覆盖的地址空间中.但是该存储桶只允许包含20个节点,因此这些节点实际上是最接近的节点的可能性很小.但是,该存储桶中的每个节点将对地址空间的该部分有更详细的了解,并且很可能能够从其自己的路由表中提供更好的关闭节点列表,然后也可以查询这些列表,依此类推.

To find a group of nodes closest to a particular key, a node would first query the closest ones it knows about from its own routing table. On average, half of these lookups will fall into the address space covered by bucket 0. But that bucket is only allowed to contain 20 nodes, so there is little chance that these nodes are the actual closest. However, each node in that bucket will have more detailed knowledge of that part of the address space, and will likely be able to provide from its own routing table a better list of close nodes, which can then also be queried, and so on.

通过这种方式,迭代查找可以很快地在实际最接近的节点组上进行磨合.

In this way, an iterative lookup very quickly hones in on the actual closest group of nodes.

当你说

遍历160 * 20个节点以找到最接近的节点是不可行的

iterating through 160*20 nodes to find the closest is infeasible

我认为您的意思是实际上无法查询其中的每一个;因为遍历这样的列表以提取最接近的列表正是节点如何处理查找请求(RPC)的原因.

I take it you mean that actually querying each of them would be infeasible; since iterating through such a list to extract the closest ones is exactly how lookup requests (RPCs) are handled by the nodes.

请注意,在实际情况下,存储桶的数量几乎不可能达到160.例如,对于十亿个节点的网络,平均存储桶数将为27.

Note that in real-world scenarios, it would be very unlikely for the number of buckets to get anywhere near 160. For example, for a network of a billion nodes, the average bucket count would be 27.

关于在Kademlia原始论文中选择的实际值,我不知道为什么将存储桶大小指定为20.但是,我想160是从SHA1哈希的位大小中得出的.通常,哈希函数用于为要存储的给定值生成密钥.如果使用SHA1进行哈希冲突的风险可以忍受,那么就可以了.如果不是,则可以使用不同的哈希算法,例如SHA256最多可产生256个存储桶.

As to the actual values chosen in the original Kademlia paper, I don't know why the bucket size is specified as 20. However, I imagine that 160 is derived from the bit-size of a SHA1 hash. Normally a hash function is used to generate the key for a given value to be stored. If the risk of a hash-collision using SHA1 is tolerably low, then this is fine. If not, a different hash algorithm could be used, e.g. SHA256 would yield up to 256 buckets.

这篇关于为什么Kademlia如何构造其路由表?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆