An array exercise


Problem description

I was trying to solve this: given a sorted array that contains consecutive integers starting from 0 (each integer may be repeated many times), e.g. 0,0,0,1,2,3,3,3,4,4 (the real array can be very long; this is just an example), efficiently find the starting and ending indices of a given integer.

I am thinking of using

1) traversal (complexity = O(n))

2) a modified binary search (complexity = O(log n)). [n = total length of the array]

Then I was wondering whether the "consecutive integers" property could be exploited to solve it. Any different ideas or suggestions?

Solution

To begin, let's ignore the "continuity" property.

As long as the problem is about finding the most efficient way to handle a single individual request, the straightforward general solution would be to perform two consecutive binary searches: the first one finds the beginning of the sequence (the run of elements equal to the requested value), the second one finds the end of that sequence. The second search is performed in the remainder of the array, i.e. to the right of the previously found beginning of the sequence.
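The two-search approach could be sketched roughly as follows. This is a minimal illustration using Python's standard `bisect` module; the function name `find_range` and the choice to return inclusive indices (or `None` when the value is absent) are assumptions made for the example, not part of the original answer.

```python
from bisect import bisect_left, bisect_right

def find_range(arr, target):
    """Return (start, end) inclusive indices of target in sorted arr, or None."""
    # First binary search: leftmost position where target could be inserted.
    start = bisect_left(arr, target)
    if start == len(arr) or arr[start] != target:
        return None  # target does not occur in arr
    # Second binary search: restricted to the part of arr right of start.
    end = bisect_right(arr, target, lo=start) - 1
    return start, end
```

For the array from the question, `find_range([0, 0, 0, 1, 2, 3, 3, 3, 4, 4], 3)` yields `(5, 7)`.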

However, if you somehow know that the average length of the sequence is relatively small, then it begins to make sense to replace the second binary search with a linear search. (This is the same principle that works when merging two sorted sequences of similar length: linear search outperforms binary search, because the structure of the input guarantees that on average the target of the search is located close to the beginning of the sequence).
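As a rough sketch of that variant, the second binary search can be swapped for a forward scan from the start index. The name `find_range_linear_end` is hypothetical, and the scan costs one comparison per element of the run, so it only pays off when runs are short.

```python
from bisect import bisect_left

def find_range_linear_end(arr, target):
    """Binary search for the start of the run, linear scan for its end."""
    start = bisect_left(arr, target)
    if start == len(arr) or arr[start] != target:
        return None  # target does not occur in arr
    end = start
    # Walk right while the next element still equals target.
    while end + 1 < len(arr) and arr[end + 1] == target:
        end += 1
    return start, end
```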

More formally, if the length of the whole array is n and the number of different integer values in the array (variety metric) is k, then linear search begins to outperform binary search on average when n/k becomes smaller than log2(n) (some implementation-dependent constant factors might be needed to come up with a practical relationship).
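The comparison above might be expressed as a small helper. This is only a heuristic sketch: the function name `prefer_linear` and the tunable `constant` parameter (standing in for the implementation-dependent factor mentioned above) are assumptions, and a sensible value for the constant would have to be measured.

```python
import math

def prefer_linear(n, k, constant=1.0):
    """Heuristic: True when the average run length n/k is below constant * log2(n)."""
    if n <= 1:
        return True  # nothing to gain from binary search on 0 or 1 elements
    return n / k < constant * math.log2(n)
```

With `n = 1024`, runs of average length 8 (`k = 128`) favour the linear scan, while runs of average length 256 (`k = 4`) favour binary search.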

The extreme example that illustrates this effect is the situation when n=k, i.e. when all values in the array are different. In that case every sequence has length 1, so a linear search finds the end of a sequence (once you know the beginning) after a single comparison, whereas a binary search over the remainder of the array still needs on the order of log2(n) steps.

But that's something that requires extra knowledge about the properties of the input array: we need to know k.

And this is when your "continuity" property comes into play!

Since the numbers are continuous, the last value in the array minus the first value in the array is equal to k-1, meaning that

k = array[n-1] - array[0] + 1

This rule can also be applied to any sub-array of your original array to calculate the variety metric for that sub-array.
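A minimal sketch of that rule for an arbitrary slice is shown below. The helper name `variety` and its half-open `[lo, hi)` convention are assumptions for illustration; the result is only meaningful when the slice really contains consecutive integers with no gaps.

```python
def variety(arr, lo=0, hi=None):
    """Number of distinct values in arr[lo:hi].

    Assumes arr is sorted and holds consecutive integers with no gaps.
    """
    if hi is None:
        hi = len(arr)
    if lo >= hi:
        return 0  # empty slice
    return arr[hi - 1] - arr[lo] + 1
```

For the example array `[0, 0, 0, 1, 2, 3, 3, 3, 4, 4]`, `variety(arr)` is 5 and `variety(arr, 5)` is 2, because the slice starting at index 5 holds only the values 3 and 4.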

That already gives you a very viable and efficient algorithm for finding the sequence: first perform a binary search for the beginning of the sequence, and then perform either binary or linear search depending on the relationship between n and k (or, even better, between the length of the right sub-array and the variety metric of the right sub-array).
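Putting the pieces together, one possible shape for the combined algorithm is sketched below. The name `find_range_hybrid` and the `constant` tuning parameter are assumptions, and the decision uses the length and variety metric of the right sub-array, as suggested above.

```python
import math
from bisect import bisect_left, bisect_right

def find_range_hybrid(arr, target, constant=1.0):
    """Binary search for the start, then linear or binary search for the end."""
    start = bisect_left(arr, target)
    if start == len(arr) or arr[start] != target:
        return None  # target does not occur in arr
    m = len(arr) - start           # length of the right sub-array
    k = arr[-1] - arr[start] + 1   # its variety metric (relies on continuity)
    if m <= 1 or m / k < constant * math.log2(m):
        # Runs are expected to be short: scan forward.
        end = start
        while end + 1 < len(arr) and arr[end + 1] == target:
            end += 1
    else:
        # Runs are expected to be long: second binary search.
        end = bisect_right(arr, target, lo=start) - 1
    return start, end
```

Both branches return the same indices; only the expected number of comparisons differs.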

P.S. The same technique can be applied to the first search as well. If you are looking for the sequence of value i, then you immediately know that it is the j-th sequence in the array (counting from 0), where j = i - array[0]. That means that the linear search for the beginning of that sequence will take about j * n/k steps on average. If this value is smaller than log2(n), linear search might be a better idea than binary search.
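A rough sketch of that idea for the first search follows. The name `find_start` and the `constant` parameter are hypothetical; the code relies on the continuity property to conclude that any value between `arr[0]` and `arr[-1]` is present, and it assumes integer inputs.

```python
import math
from bisect import bisect_left

def find_start(arr, target, constant=1.0):
    """Index of the first occurrence of target, or None if it cannot occur."""
    n = len(arr)
    if n == 0 or target < arr[0] or target > arr[-1]:
        return None
    k = arr[-1] - arr[0] + 1       # variety metric of the whole array
    j = target - arr[0]            # target's run is the j-th one (0-based)
    expected_steps = j * n / k     # average cost of scanning from the left
    if n == 1 or expected_steps < constant * math.log2(n):
        idx = 0
        # Terminates because arr[-1] >= target; continuity guarantees a hit.
        while arr[idx] < target:
            idx += 1
        return idx
    return bisect_left(arr, target)
```

In practice this only helps for values near the beginning of the array, where j is small.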
