比特:使用二叉索引树? [英] BIT:Using a binary indexed tree?

查看:19
本文介绍了比特:使用二叉索引树?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

与其他数据结构相比,二叉索引树几乎没有或相对没有需要研究的理论.唯一简洁地教授它的地方是 topcoder 教程.虽然教程在所有的解释中都是完整的,但我无法理解这种树背后的直觉是什么?以及如何证明它的正确性?

A binary indexed tree has very few or relatively no theory to study as compared to other data structures. The only place where it is taught succinctly is the topcoder tutorial. Although the tutorial is complete in all the explanations, I cannot understand that what is the intuition behind such a tree? And how to prove it's correctness?

我认为这个证明很复杂,难以解释.那么在使用 BIT 时,您遵循什么方法?

I presume that the proof is to complex to explain. So when using BIT, what approach do you follow?

推荐答案

我发现了 这个答案 @templatetypedef 非常清楚地解释了二叉索引树的直觉和证明:答案....

I found this answer by @templatetypedef very clearly explain about the intuition and proof of a binary indexed tree: The answer....

直观地,您可以将二叉索引树视为二叉树的压缩表示,而二叉树本身就是对标准数组表示的优化.这个答案有一个可能的推导.

Intuitively, you can think of a binary indexed tree as a compressed representation of a binary tree that is itself an optimization of a standard array representation. This answer goes into one possible derivation.

例如,假设您要存储总共 7 个不同元素的累积频率.您可以首先写出数字将分配到的七个桶:

Let's suppose, for example, that you want to store cumulative frequencies for a total of 7 different elements. You could start off by writing out seven buckets into which the numbers will be distributed:

[   ] [   ] [   ] [   ] [   ] [   ] [   ]
  1     2     3     4     5     6     7

现在,让我们假设累积频率如下所示:

Now, let's suppose that the cumulative frequencies look something like this:

[ 5 ] [ 6 ] [14 ] [25 ] [77 ] [105] [105]
  1     2     3     4     5     6     7

使用此版本的数组,您可以通过增加存储在该位置的数字的值来增加任何元素的累积频率,然后增加之后出现的所有元素的频率.例如,要将 3 的累积频率增加 7,我们可以向数组中位置 3 或之后的每个元素添加 7,如下所示:

Using this version of the array, you can increment the cumulative frequency of any element by increasing the value of the number stored at that spot, then incrementing the frequencies of everything that come afterwards. For example, to increase the cumulative frequency of 3 by 7, we could add 7 to each element in the array at or after position 3, as shown here:

[ 5 ] [ 6 ] [21 ] [32 ] [84 ] [112] [112]
  1     2     3     4     5     6     7

这样做的问题是需要 O(n) 的时间,如果 n 很大,这会很慢.

The problem with this is that it takes O(n) time to do this, which is pretty slow if n is large.

我们可以考虑改进此操作的一种方法是更改​​存储在存储桶中的内容.您可以考虑仅存储当前频率相对于前一个存储桶增加的数量,而不是存储到给定点的累积频率.例如,在我们的例子中,我们将上面的bucket重写如下:

One way that we can think about improving this operation would be to change what we store in the buckets. Rather than storing the cumulative frequency up to the given point, you can instead think of just storing the amount that the current frequency has increased relative to the previous bucket. For example, in our case, we would rewrite the above buckets as follows:

Before:
[ 5 ] [ 6 ] [21 ] [32 ] [84 ] [112] [112]
  1     2     3     4     5     6     7

After:
[ +5] [ +1] [+15] [+11] [+52] [+28] [ +0]
  1     2     3     4     5     6     7

现在,我们可以通过向该桶添加适当的数量,在 O(1) 时间内增加该桶内的频率.但是,现在进行查找的总成本变为 O(n),因为我们必须通过将所有较小存储桶中的值相加来重新计算存储桶中的总成本.

Now, we can increment the frequency within a bucket in time O(1) by just adding the appropriate amount to that bucket. However, the total cost of doing a lookup now becomes O(n), since we have to recompute the total in the bucket by summing up the values in all smaller buckets.

我们需要从这里获得二叉索引树的第一个主要见解如下:与其不断地重新计算特定元素之前的数组元素的总和,如果我们要预先计算所有元素的总和会怎样?序列中特定点之前的元素?如果我们能做到这一点,那么我们只需将这些预先计算的总和的正确组合相加,就可以计算出某一点的累积总和.

The first major insight we need to get from here to a binary indexed tree is the following: rather than continuously recomputing the sum of the array elements that precede a particular element, what if we were to precompute the total sum of all the elements before specific points in the sequence? If we could do that, then we could figure out the cumulative sum at a point by just summing up the right combination of these precomputed sums.

实现此目的的一种方法是将表示从桶数组更改为节点的二叉树.每个节点都将使用一个值进行注释,该值表示该给定节点左侧的所有节点的累积总和.例如,假设我们从这些节点构造以下二叉树:

One way to do this is to change the representation from being an array of buckets to being a binary tree of nodes. Each node will be annotated with a value that represents the cumulative sum of all the nodes to the left of that given node. For example, suppose we construct the following binary tree from these nodes:

             4
          /     
         2       6
        /      / 
       1   3   5   7

现在,我们可以通过存储包括该节点及其左子树在内的所有值的累积总和来扩充每个节点.例如,给定我们的值,我们将存储以下内容:

Now, we can augment each node by storing the cumulative sum of all the values including that node and its left subtree. For example, given our values, we would store the following:

Before:
[ +5] [ +1] [+15] [+11] [+52] [+28] [ +0]
  1     2     3     4     5     6     7

After:
                 4
               [+32]
              /     
           2           6
         [ +6]       [+80]
         /          /   
        1     3     5     7
      [ +5] [+15] [+52] [ +0]

鉴于这种树结构,很容易确定一个点的累积总和.想法如下:我们维护一个计数器,最初为 0,然后进行正常的二进制搜索,直到找到有问题的节点.当我们这样做时,我们还会执行以下操作:每当我们向右移动时,我们也会将当前值添加到计数器中.

Given this tree structure, it's easy to determine the cumulative sum up to a point. The idea is the following: we maintain a counter, initially 0, then do a normal binary search up until we find the node in question. As we do so, we also the following: any time that we move right, we also add in the current value to the counter.

例如,假设我们要查找 3 的总和.为此,我们执行以下操作:

For example, suppose we want to look up the sum for 3. To do so, we do the following:

  • 从根 (4) 开始.计数器为 0.
  • 向左转至节点 (2).计数器为 0.
  • 向右转到节点 (3).计数器是 0 + 6 = 6.
  • 查找节点 (3).计数器是 6 + 15 = 21.

您也可以想象反向运行此过程:从给定节点开始,将计数器初始化为该节点的值,然后沿着树向上走到根.任何时候你沿着右子链接向上,在你到达的节点处添加值.例如,要找到 3 的频率,我们可以执行以下操作:

You could imagine also running this process in reverse: starting at a given node, initialize the counter to that node's value, then walk up the tree to the root. Any time you follow a right child link upward, add in the value at the node you arrive at. For example, to find the frequency for 3, we could do the following:

  • 从节点 (3) 开始.计数器是 15.
  • 向上到达节点 (2).计数器是 15 + 6 = 21.
  • 向上到达节点 (1).计数器是 21.

为了增加一个节点的频率(以及,隐式地,它之后的所有节点的频率),我们需要更新树中的节点集,包括该节点在其左子树中.为此,我们执行以下操作:增加该节点的频率,然后开始向上走到树的根部.每当您点击将您作为左孩子的链接时,通过添加当前值来增加您遇到的节点的频率.

To increment the frequency of a node (and, implicitly, the frequencies of all nodes that come after it), we need to update the set of nodes in the tree that include that node in its left subtree. To do this, we do the following: increment the frequency for that node, then start walking up to the root of the tree. Any time you follow a link that takes you up as a left child, increment the frequency of the node you encounter by adding in the current value.

例如,要将节点 1 的频率增加 5,我们将执行以下操作:

For example, to increment the frequency of node 1 by five, we would do the following:

                 4
               [+32]
              /     
           2           6
         [ +6]       [+80]
         /          /   
      > 1     3     5     7
      [ +5] [+15] [+52] [ +0]

从节点 1 开始,将其频率增加 5 得到

Starting at node 1, increment its frequency by 5 to get

                 4
               [+32]
              /     
           2           6
         [ +6]       [+80]
         /          /   
      > 1     3     5     7
      [+10] [+15] [+52] [ +0]

现在,转到其父级:

                 4
               [+32]
              /     
         > 2           6
         [ +6]       [+80]
         /          /   
        1     3     5     7
      [+10] [+15] [+52] [ +0]

我们沿着左子链接向上,所以我们也增加了这个节点的频率:

We followed a left child link upward, so we increment this node's frequency as well:

                 4
               [+32]
              /     
         > 2           6
         [+11]       [+80]
         /          /   
        1     3     5     7
      [+10] [+15] [+52] [ +0]

我们现在转到它的父级:

We now go to its parent:

               > 4
               [+32]
              /     
           2           6
         [+11]       [+80]
         /          /   
        1     3     5     7
      [+10] [+15] [+52] [ +0]

那是一个左子链接,所以我们也增加这个节点:

That was a left child link, so we increment this node as well:

                 4
               [+37]
              /     
           2           6
         [+11]       [+80]
         /          /   
        1     3     5     7
      [+10] [+15] [+52] [ +0]

现在我们完成了!

最后一步是将其转换为二叉索引树,这就是我们可以用二进制数做一些有趣的事情的地方.让我们用二进制重写这棵树中的每个桶索引:

The final step is to convert from this to a binary indexed tree, and this is where we get to do some fun things with binary numbers. Let's rewrite each bucket index in this tree in binary:

                100
               [+37]
              /     
          010         110
         [+11]       [+80]
         /          /   
       001   011   101   111
      [+10] [+15] [+52] [ +0]

在这里,我们可以进行非常非常酷的观察.取这些二进制数中的任何一个,找出在该数中设置的最后一个 1,然后删除该位以及它后面的所有位.您现在剩下以下内容:

Here, we can make a very, very cool observation. Take any of these binary numbers and find the very last 1 that was set in the number, then drop that bit off, along with all the bits that come after it. You are now left with the following:

              (empty)
               [+37]
              /     
           0           1
         [+11]       [+80]
         /          /   
        00   01     10   11
      [+10] [+15] [+52] [ +0]

这是一个非常非常酷的观察:如果你把 0 表示左"而 1 表示右",那么每个数字上的剩余位准确地说明如何从根开始然后向下走数字.例如,节点 5 的二进制模式为 101.最后一个 1 是最后一位,所以我们去掉它得到 10.事实上,如果你从根开始,向右 (1),然后向左 (0),你结束在节点 5!

Here is a really, really cool observation: if you treat 0 to mean "left" and 1 to mean "right," the remaining bits on each number spell out exactly how to start at the root and then walk down to that number. For example, node 5 has binary pattern 101. The last 1 is the final bit, so we drop that to get 10. Indeed, if you start at the root, go right (1), then go left (0), you end up at node 5!

这很重要的原因是我们的查找和更新操作取决于从节点返回到根的访问路径,以及我们是跟踪左子链接还是右子链接.例如,在查找过程中,我们只关心我们遵循的左侧链接.在更新期间,我们只关心我们遵循的正确链接.这种二叉索引树仅使用索引中的位就可以非常高效地完成所有这些工作.

The reason that this is significant is that our lookup and update operations depend on the access path from the node back up to the root and whether we're following left or right child links. For example, during a lookup, we just care about the left links we follow. During an update, we just care about the right links we follow. This binary indexed tree does all of this super efficiently by just using the bits in the index.

关键技巧是这个完美二叉树的以下特性:

The key trick is the following property of this perfect binary tree:

给定节点 n,返回到我们向右走的根的访问路径上的下一个节点是通过采用 n 的二进制表示并删除最后一个 1 来给出的.

Given node n, the next node on the access path back up to the root in which we go right is given by taking the binary representation of n and removing the last 1.

例如看一下节点7的访问路径是111.我们所取的到根的访问路径上的节点是向上跟随右指针的节点是

For example, take a look at the access path for node 7, which is 111. The nodes on the access path to the root that we take that involve following a right pointer upward is

  • 节点 7:111
  • 节点 6:110
  • 节点 4:100

所有这些都是正确的链接.如果我们取节点 3 的访问路径,即 011,查看我们向右走的节点,我们得到

All of these are right links. If we take the access path for node 3, which is 011, and look at the nodes where we go right, we get

  • 节点 3:011
  • 节点 2:010
  • (节点 4:100,在左侧链接之后)

这意味着我们可以非常非常有效地计算一个节点的累积总和,如下所示:

This means that we can very, very efficiently compute the cumulative sum up to a node as follows:

  • 以二进制写出节点 n.
  • 将计数器设置为 0.
  • 在 n ≠ 时重复以下操作0:
    • 在节点 n 处添加值.
    • 从 n 中删除最左边的 1 位.

    同样,让我们​​考虑如何进行更新步骤.为此,我们希望沿着访问路径返回到根,更新我们沿着左链接向上的所有节点.我们可以通过基本上执行上述算法来做到这一点,但将所有 1 转换为 0,将 0 转换为 1.

    Similarly, let's think about how we would do an update step. To do this, we would want to follow the access path back up to the root, updating all nodes where we followed a left link upward. We can do this by essentially doing the above algorithm, but switching all 1's to 0's and 0's to 1's.

    二叉索引树的最后一步是注意,由于这种按位技巧,我们甚至不再需要显式存储树.我们可以将所有节点存储在长度为 n 的数组中,然后使用按位旋转技术隐式导航树.事实上,这正是按位索引树所做的——它将节点存储在一个数组中,然后使用这些按位技巧有效地模拟在这棵树中向上行走.

    The final step in the binary indexed tree is to note that because of this bitwise trickery, we don't even need to have the tree stored explicitly anymore. We can just store all the nodes in an array of length n, then use the bitwise twiddling techniques to navigate the tree implicitly. In fact, that's exactly what the bitwise indexed tree does - it stores the nodes in an array, then uses these bitwise tricks to efficiently simulate walking upward in this tree.

    希望这有帮助!

    这篇关于比特:使用二叉索引树?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆