优化sum(i,j)和update(i,value)以获取整数 [英] Optimize sum(i,j) and update(i,value) for an arryas of integers

查看:131
本文介绍了优化sum(i,j)和update(i,value)以获取整数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

给出大量整数,优化函数sum(i,j)和update(i,value),以使两个函数的取值都小于O(n).

Given a huge array of integers, optimize the functions sum(i,j) and update(i,value), so that both the functions take less than O(n).

更新

这是一个面试问题.我试过O(n)sum(i,j)和O(1)update(i,value).其他解决方案是将输入数组预处理为二维数组,以得出sum(i,j)的O(1)答案.但这使O(n)的更新函数成为可能.

Its an interview question. I have tried O(n) sum(i,j) and O(1) update(i, value). Other solution is preprocess the input array into 2-d array to give O(1) answer for sum(i,j). But that makes the update function of O(n).

例如,给定一个数组:

A[] = {4,5,4,3,6,8,9,1,23,34,45,56,67,8,9,898,64,34,67,67,56,...}

要定义的操作是sum(i,j)update(i,value).

  • sum(i,j)给出从索引ij的数字总和.
  • update(i, value)用给定的value更新索引i的值.
  • sum(i,j) gives sum of numbers from index i to j.
  • update(i, value) updates the value at index i with the given value.

最直接的答案是,sum(i,j)可以在O(n)时间内计算,并在O(1)时间内更新(i,value).

The very straight answer is that sum(i,j) can be calculated in O(n) time and update(i,value) in O(1) time.

第二种方法是预先计算sum(i,j)并将其存储在二维数组SUM[n][n]中,并在查询时在O(1)时间给出答案.但是随后更新功能update(i,value)变为O(n)阶,因为必须更新与索引i相对应的整个行/列.

The second approach is that precompute the sum(i,j) and store it in a 2-d array SUM[n][n] and when queried for, give the answer in O(1) time. But then the update function update(i,value) becomes of order O(n) as an entire row/column corresponding to index i has to be updated.

面试官给了我暗示要进行一些预处理并使用一些数据结构,但是我没想到.

The interviewer gave me hint to do some preprocessing and use some data structure, but I couldn't think of.

推荐答案

您需要的是一个细分树.段树可以在O(log(n))时间内执行sum(i, j)update(i, value).

What you need is a Segment Tree. A Segment tree can perform sum(i, j) and update(i, value) in O(log(n)) time.

Wikipedia 的引用:

在计算机科学中,段树是用于存储间隔或段的树数据结构.它允许查询哪些存储段包含给定点.从原则上讲,它是静态结构;也就是说,一旦构建,便无法修改其结构.

In computer science, a segment tree is a tree data structure for storing intervals, or segments. It allows querying which of the stored segments contain a given point. It is, in principle, a static structure; that is, its structure cannot be modified once it is built.

树的叶子将是初始数组元素.他们的父母将是孩子的总和.例如:假设data[] = {2, 5, 4, 7, 8, 9, 5},那么我们的树将如下所示:

The leaves of the tree will be the initial array elements. Their parents will be the sum of their children. For example: Suppose data[] = {2, 5, 4, 7, 8, 9, 5}, then our tree will be as follows:

此树结构使用数组表示.我们将此数组称为seg_tree.因此, seg_tree [] 的根将存储在数组的索引1中.它的两个子项将存储在索引2和3中.1-indexed表示的一般趋势将是:

This tree structure is represented using arrays. Let's call this array seg_tree. So the root of the seg_tree[] will be stored at index 1 of the array. It's two children will be stored at indexes 2 and 3. The general trend for 1-indexed representation will be:

  • 索引i的左子项位于索引2*i.
  • 索引i的右子元素位于索引2*i+1.
  • 索引为i的父级位于索引为i/2.
  • Left child of index i is at index 2*i.
  • Right child of index i is at index 2*i+1.
  • Parent of index i is at index i/2.

,对于0-indexed表示形式:

  • 索引i的左子项位于索引2*i+1.
  • 索引i的右子元素位于索引2*i+2.
  • 索引为i的父级位于索引为(i-1)/2.
  • Left child of index i is at index 2*i+1.
  • Right child of index i is at index 2*i+2.
  • Parent of index i is at index (i-1)/2.

上图中的每个间隔[i, j]表示间隔data[i, j]中所有元素的总和.根节点将表示整个data []的总和,即sum(0, 6).它的两个子代将表示sum(0, 3)sum(4, 6),依此类推.

Each interval [i, j] in the above picture denotes the sum of all elements in the interval data[i, j]. The root node will denote the sum of the whole data[], i.e., sum(0, 6) . Its two children will denote the sum(0, 3) and sum(4, 6) and so on.

seg_tree [] MAX_LEN 的长度为(如果n = length of data[]):

The length of seg_tree[], MAX_LEN, will be (if n = length of data[]):

  • 2*n-1当n为2的幂时
  • 2*(2^(log_2(n)+1) - 1当n不是2的幂时.
  • 2*n-1 when n is a power of 2
  • 2*(2^(log_2(n)+1) - 1 when n is not a power of 2.

在这种情况下,我们将假定0-indexed的构造.索引0将是树的根,初始数组的元素将存储在叶中.

We will assume a 0-indexed construction in this case. Index 0 will be the root of the tree and the elements of the initial array will be stored in the leaves.

data[0...(n-1)]是初始数组,seg_tree[0...MAX_LEN] data [] 的段树表示.了解如何从伪代码构造树将更容易:

data[0...(n-1)] is the initial array and seg_tree[0...MAX_LEN] is the segment tree representation of data[]. It will be easier to understand how to construct the tree from the pseudo code:

build(node, start, end) {

    // invalid interval
    if start > end:
        return

    // leaf nodes
    if start == end:
        tree[node] = data[start]
        return

    // build left and right subtrees
    build(2*node+1, start, (start + end)/2);
    build(2*node+2, 1+(start+end)/2, end);

    // initialize the parent with the sum of its children
    tree[node] = tree[2*node+1] + tree[2*node+2]
}

在这里

  • [start, end]表示data []中要形成段树表示的间隔.最初是(0, n-1).
  • node 表示 seg_tree [] 中的当前索引.
  • [start, end] denotes the interval in data[] for which segment tree representation is to be formed. Initially, this is (0, n-1).
  • node represents the current index in the seg_tree[].

我们通过调用build(0, 0, n-1)开始构建过程.第一个参数表示根在 seg_tree [] 中的位置.第二个和第三个参数表示 data [] 中要形成分段树表示形式的间隔.在每个后续调用中, node 将代表 seg_tree [] 的索引,而(start,end)将代表 seg_tree的间隔[node] 将存储总和.

We start the building process by calling build(0, 0, n-1). The first argument denotes the position of the root in seg_tree[]. Second and third argument denotes the interval in data[] for which segment tree representation is to be formed. In each subsequent call node will represent the index of seg_tree[] and (start, end) will denote the interval for which seg_tree[node] will store the sum.

有三种情况:

  • start > end是无效的间隔,我们只是从此调用中返回.
  • 如果start == end表示 seg_tree [] 的叶子,因此我们初始化tree[node] = data[start]
  • 否则,我们处于一个有效间隔中,该间隔不是叶子.因此,我们首先通过调用build(node, start, (start + end)/2)来构建此节点的左子节点,然后通过调用build(node, 1+(start+end)/2, end)来构建右子树.然后,我们通过 seg_tree [] 中的子节点总和来初始化当前索引.
  • start > end is an invalid interval and we simply return from this call.
  • if start == end, represents the leaf of the seg_tree[] and hence, we initialize tree[node] = data[start]
  • Otherwise, we are in a valid interval which is not a leaf. So we first build the left child of this node by calling build(node, start, (start + end)/2), then the right subtree by calling build(node, 1+(start+end)/2, end). Then we initialize the current index in seg_tree[] by the sum of its child nodes.

我们需要检查节点的间隔是否与给定间隔(i,j)重叠(部分/完整),或者根本不重叠.这是三种情况:

We need to check whether the intervals at nodes overlap (partial/complete) with the given interval (i, j) or they do not overlap at all. These are the three cases:

  • 对于不存在重叠的情况,我们可以简单地返回0.
  • 对于完全重叠,我们将返回存储在该节点上的值.
  • 对于部分重叠,我们将拜访两个孩子并递归继续进行此检查.

假设我们需要找到sum(1, 5)的值.我们进行如下操作:

Suppose we need to find the value of sum(1, 5). We proceed as follows:

让我们拿一个空容器(Q),它将存储感兴趣的时间间隔.最终,所有这些范围都将被它们返回的值替换.最初,Q = {(0, 6)}.

Let us take an empty container (Q) which will store the intervals of interest. Eventually all these ranges will be replaced by the values that they return. Initially, Q = {(0, 6)}.

我们注意到(1,5)并不完全重叠(0,6),因此我们删除了该范围并添加了其子范围. Q = {(0, 3), (4, 6)}

We notice that (1, 5) does not completely overlap (0, 6), so we remove this range and add its children ranges. Q = {(0, 3), (4, 6)}

现在,(1,5)部分重叠(0,3).因此,我们删除(0,3)并插入其两个子代. Q = {(0, 1), (2, 3), (4, 6)}

Now, (1, 5) partially overlaps (0, 3). So we remove (0, 3) and insert its two children. Q = {(0, 1), (2, 3), (4, 6)}

(1,5)部分重叠(0,1),因此我们将其删除并插入其两个子范围. Q = {(0, 0), (1, 1), (2, 3), (4, 6)}

(1, 5) partially overlaps (0, 1), so we remove this and insert its two children range. Q = {(0, 0), (1, 1), (2, 3), (4, 6)}

现在(1,5)不与(0,0)重叠,因此我们将(0,0)替换为它将返回(由于没有重叠,所以为0).问= {(0, (1, 1), (2, 3), (4, 6)}

Now (1, 5) does not overlap (0, 0), so we replace (0, 0) with the value that it will return (which is 0 because of no overlap). Q = {(0, (1, 1), (2, 3), (4, 6)}

接下来,(1,5)(1,1)完全重叠,因此我们返回存储在代表该范围的节点中的值(即5) . Q = {0, 5, (2, 3), (4, 6)}

Next, (1, 5) completely overlaps (1, 1), so we return the value stored in the node that represents this range (i.e., 5). Q = {0, 5, (2, 3), (4, 6)}

接下来,(1,5)再次完全重叠(2,3),因此我们返回值11.Q = {0, 5, 11, (4, 6)}

Next, (1, 5) again completely overlaps (2, 3), so we return the value 11. Q = {0, 5, 11, (4, 6)}

接下来,(1,5)部分重叠(4,6),因此我们将其范围替换为其两个子级. Q = {0, 5, 11, (4, 5), (6, 6)}

Next, (1, 5) partially overlaps (4, 6) so we replace this range by its two children. Q = {0, 5, 11, (4, 5), (6, 6)}

快速转发操作,我们注意到(1,5)(4,5)完全重叠,因此我们将其替换为17和(1, 5)不与(6,6)重叠,因此我们将其替换为0.最后是Q = {0, 5, 11, 17, 0}.该查询的答案是Q中所有元素的总和,即33.

Fast forwarding the operations, we notice that (1, 5) completely overlaps (4, 5) so we replace this by 17 and (1, 5) does not overlap (6, 6), so we replace it with 0. Finally, Q = {0, 5, 11, 17, 0}. The answer of the query is the sum of all the elements in Q, which is 33.

对于update(i, value),此过程有点类似.首先,我们将搜索范围(i,i).我们在此路径中遇到的所有节点也将需要更新.让change = (new_value - old_value).然后,在搜索范围(i,i)的过程中遍历树时,我们会将此更改添加到所有节点(最后一个节点除外),最后一个节点将简单地替换为新值.例如,让查询为update(5, 8).

For update(i, value), the process is somewhat similar. First we will search for the range (i, i). All the nodes that we encounter in this path will also need to be updated. Let change = (new_value - old_value). Then while traversing the tree in the search of the range (i, i), we will add this change to all those nodes except the last node which will simply be replaced by the new value. For example, let the query be update(5, 8).

change = 8-9 = -1.

遇到的路径将是(0, 6) -> (4, 6) -> (4, 5) -> (5, 5).

(0, 6) = 40 + change = 40 - 1 = 39的最终值.

(4, 6) = 22 + change = 22 - 1 = 21的最终值.

(4, 5) = 17 + change = 17 - 1 = 16的最终值.

(5, 5) = 8的最终值. 最终的树将如下所示:

Final Value of (5, 5) = 8. The final tree will look like this:

我们可以在O(n)时间内使用数组创建分段树表示,并且这两个操作的时间复杂度都为O(log(n)).

We can create a segment tree representation using arrays in O(n) time and both of these operations have a time complexity of O(log(n)).

一般而言,段树可以有效地执行以下操作:

In general Segment Trees can perform the following operations efficiently:

  • 更新:它可以更新:
    • 给定索引的元素.
    • 将所有元素间隔为给定值.在这种情况下,通常采用 Lazy Propagation 技术来实现效率.
    • Update: It can update:
      • an element at a given index.
      • all the elements in an interval to a given value. Lazy Propagation technique is generally employed in this case to achieve efficiency.
      • 间隔中的最小元素
      • 间隔中的最大元素
      • 区间中所有元素的总和/积

      另一个数据结构间隔树也可以用于解决此问题.建议读物:

      Another data structure Interval Tree can also be used to solve this problem. Suggested Readings:

      • Wikipedia Page on Segment Tree
      • PEGWiki Page on Segment Tree
      • HackerEarth notes on Segment Tree
      • Visualization of Segment Tree Operations

      这篇关于优化sum(i,j)和update(i,value)以获取整数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆