是否可以在 O(lg N) 范围内查询不同整数的数量? [英] Is it possible to query number of distinct integers in a range in O(lg N)?

查看:25
本文介绍了是否可以在 O(lg N) 范围内查询不同整数的数量?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已经阅读了一些关于两种常见数据结构的教程,它们可以在 O(lg N) 中实现范围更新和查询:段树二元索引树(BIT/Fenwick Tree).

I have read through some tutorials about two common data structure which can achieve range update and query in O(lg N): Segment tree and Binary Indexed Tree (BIT / Fenwick Tree).

我发现的大多数示例都是关于一些关联和交换操作,例如范围内的整数总和"、范围内的整数异或"等.

Most of the examples I have found is about some associative and commutative operation like "Sum of integers in a range", "XOR integers in a range", etc.

不知道这两种数据结构(或任何其他数据结构/算法,请提出)是否可以在O(lg N)中实现以下查询?(如果不是,O(sqrt N) 怎么样)

I wonder if these two data structures (or any other data structures / algorithm, please propose) can achieve the below query in O(lg N)? (If no, how about O(sqrt N))

给定一个整数A数组,查询[l,r]范围内不同整数的个数

PS:假设可用整数的数量是 ~ 10^5,所以 used[color] = true 或位掩码是不可能的

PS: Assuming the number of available integer is ~ 10^5, so used[color] = true or bitmask is not possible

例如:A = [1,2,3,2,4,3,1],query([2,5]) = 3,其中范围索引从 0 开始.

For example: A = [1,2,3,2,4,3,1], query([2,5]) = 3, where the range index is 0-based.

推荐答案

是的,这可以在 O(log n) 中完成,即使您应该在线回答查询.然而,这需要一些相当复杂的技术.

Yes, this is possible to do in O(log n), even if you should answer queries online. However, this requires some rather complex techniques.

首先,让我们解决以下问题:给定一个数组,回答形式为索引 [l, r] 内有多少个数字 <= x"的查询.这是通过一个类似于段树的结构来完成的,这种结构有时被称为合并排序树.它基本上是一个段树,其中每个节点存储一个排序的子数组.这种结构需要 O(n log n) 的内存(因为有 log n 层,每层都需要存储 n 个数字).它也是在 O(n log n) 中构建的:您只需自下而上,并为每个内部顶点合并其子项的排序列表.

First, let's solve the following problem: given an array, answer the queries of form "how many numbers <= x are there within indices [l, r]". This is done with a segment-tree-like structure which is sometimes called Merge Sort Tree. It is basically a segment tree where each node stores a sorted subarray. This structure requires O(n log n) memory (because there are log n layers and each of them requires storing n numbers). It is built in O(n log n) as well: you just go bottom-up and for each inner vertex merge sorted lists of its children.

这是一个例子.假设 1 5 2 6 8 4 7 1 是一个原始数组.

Here is an example. Say 1 5 2 6 8 4 7 1 be an original array.

|1 1 2 4 5 6 7 8|
|1 2 5 6|1 4 7 8|
|1 5|2 6|4 8|1 7|
|1|5|2|6|8|4|7|1|

现在您可以在 O(log^2 n 次) 内回答这些查询:只需对线段树进行常规查询(遍历 O(log n) 个节点)并进行二分搜索即可知道有多少个数字

= x 在那个节点中(从这里额外的 O(log n)).

Now you can answer for those queries in O(log^2 n time): just make a reqular query to a segment tree (traversing O(log n) nodes) and make a binary search to know how many numbers <= x are there in that node (additional O(log n) from here).

使用 Fractional Cascading 技术,这可以加速到 O(log n),基本上允许您不是在每个节点中而是仅在根中进行二分搜索.然而,它足够复杂,可以在帖子中描述.

This can be speed up to O(log n) using Fractional Cascading technique, which basically allows you to do the binary search not in each node but only in the root. However it is complex enough to be described in the post.

现在我们回到原来的问题.假设您有一个数组 a_1, ..., a_n.构建另一个数组 b_1, ..., b_n,其中 b_i = 数组中下一次出现 a_i 的索引,如果是最后一次出现则为 ∞.

Now we return to the original problem. Assume you have an array a_1, ..., a_n. Build another array b_1, ..., b_n, where b_i = index of the next occurrence of a_i in the array, or ∞ if it is the last occurrence.

示例(1 索引):

a = 1 3 1 2 2 1 4 1
b = 3 ∞ 6 5 ∞ 8 ∞ ∞

现在让我们计算 [l, r] 中的数字.对于每个唯一编号,我们将计算其在段中的最后一次出现次数.使用 b_i 概念,您可以看到当且仅当 b_i >r.因此,问题归结为段 [l, r] 中有多少个数字 > r",这可以简化为我上面描述的内容.

Now let's count numbers in [l, r]. For each unique number we'll count its last occurrence in the segment. With b_i notion you can see that the occurrence of the number is last if and only if b_i > r. So the problem boils down to "how many numbers > r are there in the segment [l, r]" which is trivially reduced to what I described above.

希望有帮助.

这篇关于是否可以在 O(lg N) 范围内查询不同整数的数量?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆