How to understand the time complexity of Kademlia node operations


Question


I'm now learning Kademlia network by reading the classical paper Kademlia: A Peer-to-peer Information System Based on the XOR Metric. I want to understand the complexity of its operation but still cannot figure it out.

In the 3 Sketch of proof section, the paper gives two definitions:

  1. Depth of a node (h): 160 − i, where i is the smallest index of a non-empty bucket
  2. Node y’s bucket height in node x: the index of the bucket into which x would insert y minus the index of x’s least significant empty bucket.

And three conclusions:

  1. With overwhelming probability the height of any given node will be within a constant of log n for a system with n nodes.
  2. The bucket height of the closest node to an ID in the kth-closest node will likely be within a constant of log k.
  3. If none of this node’s h most significant k-buckets is empty, the lookup procedure will find a node half as close (or rather whose distance is one bit shorter) in each step, and thus turn up the node in h − log k steps.

So my questions are:

  1. What is "least significant empty bucket" and "most significant k-buckets"?
  2. How to explain the depth and bucket height in visual way?
  3. How to understand the second and third conclusions, say, why log k and h - log k?

Solution

It has been a while since I've actually read the paper, so I'm mostly piecing this together from my implementation experience instead of trying to match the concepts in my head to the formal definitions in the paper. Take the following with a grain of salt.


What is "least significant empty bucket" and "most significant k-buckets"?

That basically refers to the buckets sorted by XOR distance relative to the node's own ID.
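To make the XOR ordering concrete, here is a small Python sketch (not from the paper) using hypothetical 8-bit IDs instead of Kademlia's 160 bits: distance is simply the XOR of two IDs, and sorting peers by that distance is what the bucket ordering reflects.

```python
# Toy illustration with hypothetical 8-bit IDs: Kademlia's distance
# metric is XOR, and buckets are ordered by which bit of that
# distance differs first from the node's own ID.
own_id = 0b1010_1010

peers = [0b1010_1011, 0b0010_1010, 0b1110_1010, 0b1010_0010]

# Peers sharing a longer prefix with own_id have a smaller XOR
# distance and belong in a "less significant" (closer) bucket.
for peer in sorted(peers, key=lambda p: p ^ own_id):
    print(f"{peer:08b}  xor-distance={peer ^ own_id:3d}")
```

Note how the peer differing only in the last bit sorts first (distance 1), while the peer differing in the top bit sorts last (distance 128).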

How to explain the depth and bucket height in visual way?

Each bucket covers a keyspace region, e.g. from 0x0000 to 0x0FFF (simplified to 2 bytes), i.e. 1/16th of the keyspace. This can be expressed as a CIDR-like mask, 0x0/4 (4 prefix bits). That's more or less the depth of a bucket.
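The prefix/region correspondence above can be sketched in a few lines of Python; the function name and the 2-byte keyspace are my own simplification, matching the 0x0/4 example.

```python
def bucket_range(prefix: int, prefix_bits: int, id_bits: int = 16):
    """Keyspace region covered by a bucket given its CIDR-like prefix.
    E.g. prefix 0x0 with 4 prefix bits covers 0x0000-0x0FFF,
    i.e. 1/16th of a 16-bit keyspace."""
    low = prefix << (id_bits - prefix_bits)
    high = low + (1 << (id_bits - prefix_bits)) - 1
    fraction = 1 / (1 << prefix_bits)
    return low, high, fraction

print(bucket_range(0x0, 4))  # (0, 4095, 0.0625) -> 0x0000-0x0FFF, 1/16th
```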

There are several ways to organize a routing table. The "correct" way is to represent it as tree or sorted list based on the lowest ID represented by a bucket. This approach allows for arbitrary bucket split operations as some routing table optimizations call for and can also be used to implement node multihoming.

A simplified implementation may instead use a fixed-length array and put each bucket at the position of its shared prefix bits relative to the node's own ID. I.e. position 0 in the array will have 0 shared prefix bits; it's the most-distant bucket, covering 50% of the keyspace, and the bucket whose most significant bit is the inverted MSB of the node's own ID.

In that case the depth is simply the array position.
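A sketch of that array-position computation, under the same fixed-length-array assumption (the function name is mine, not the paper's):

```python
def bucket_position(own_id: int, other_id: int, id_bits: int = 160) -> int:
    """Array slot of other_id's bucket in the simplified fixed-length
    routing table: the count of leading bits shared with own_id.
    0 = the most-distant bucket, covering half the keyspace."""
    distance = own_id ^ other_id
    # bit_length of the XOR distance locates the first differing bit;
    # everything above it is shared prefix.
    return id_bits - distance.bit_length()

own = 0x2A << 152                              # hypothetical 160-bit node ID
print(bucket_position(own, own ^ (1 << 159)))  # 0 -> farthest bucket
print(bucket_position(own, own ^ 1))           # 159 -> closest bucket
```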

How to understand the second and third conclusions, say, why log k and h - log k?

Say you are looking for an ID that is the furthest away from your own node's ID. Then you will only have one bucket covering that part of the keyspace, namely it will cover half the keyspace with the most significant bit differing from yours. So you ask one (or several) nodes from that bucket. By virtue of their ID bits having the first bit in common with your lookup target they are more or less guaranteed to have split that in two or more, i.e. have at least double the keyspace coverage for the target space. So they can provide at least 1 bit better information.

As you query closer nodes to the target they will also have better keyspace coverage near the target region because that's also closer to their own node ID.

Rinse, repeat until there are no closer nodes to be found.

Since each hop shaves off at least 1 bit of distance at a time, you basically need O(log n) hops, where n is the network size: the network size dictates the average distance between nodes, and thus the bucket depth needed for your home bucket.
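As a toy model (my own back-of-the-envelope sketch, not a simulation of the real protocol): if each hop resolves roughly one more prefix bit, and only about log2(n) prefix bits separate random nodes before the target falls inside a single bucket, the hop count grows logarithmically with the network size.

```python
import math

def expected_hops(network_size: int) -> int:
    """Toy estimate: one prefix bit resolved per hop, and roughly
    log2(n) bits separate a random node from the target's home
    bucket in a network of n nodes."""
    return math.ceil(math.log2(network_size))

for n in (1_000, 1_000_000, 1_000_000_000):
    print(f"{n:>13} nodes -> ~{expected_hops(n)} hops")
```

Note how a thousandfold increase in network size only adds about 10 hops.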

If the target key is closer to your own ID you will need fewer steps since you will have better coverage of that region of the keyspace.

Since k is a constant (the nodes per bucket), so is log k. Double the number of nodes in a bucket and it'll have twice the resolution of a given keyspace region, and thus will (probabilistically) yield a node that is one bit closer to the target than a bucket of size k/2. I.e. you have to double the number of entries per bucket for each additional bit per hop you wish to save.
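The paper's h − log k bound can be checked with simple arithmetic; the depth of 20 here is an arbitrary illustrative value, not from the paper.

```python
import math

def hops_bound(depth_bits: int, k: int) -> float:
    """The paper's bound: a node of depth h finds a target in about
    h - log2(k) steps, since a full k-bucket resolves roughly
    log2(k) bits of the target at once."""
    return max(depth_bits - math.log2(k), 0)

# Doubling k shaves roughly one hop off the bound.
print(hops_bound(20, 8))   # 17.0
print(hops_bound(20, 16))  # 16.0
```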


Edit: Here's a visualization of an actual single-homed BitTorrent DHT routing table, sorted by prefix, i.e. not relative to the local node ID:

Node ID: 2A631C8E 7655EF7B C6E99D8A 3BF810E2 1428BFD4
buckets: 23 / entries: 173
000...   entries:8 replacements:8
00100...   entries:8 replacements:0
0010100...   entries:8 replacements:2
0010101000...   entries:8 replacements:4
00101010010...   entries:8 replacements:7
001010100110000...   entries:8 replacements:3
0010101001100010...   entries:8 replacements:3
00101010011000110000...   entries:8 replacements:1
001010100110001100010...   entries:3 replacements:0
0010101001100011000110...   entries:6 replacements:0
0010101001100011000111...   entries:6 replacements:0
0010101001100011001...   entries:8 replacements:2
001010100110001101...   entries:8 replacements:1
00101010011000111...   entries:8 replacements:2
00101010011001...   entries:7 replacements:0
0010101001101...   entries:8 replacements:0
001010100111...   entries:8 replacements:0
001010101...   entries:8 replacements:1
00101011...   entries:7 replacements:0
001011...   entries:8 replacements:0
0011...   entries:8 replacements:8
01...   entries:8 replacements:8
1...   entries:8 replacements:8

