Efficient algorithm to produce the n-way intersection of sorted arrays in C


Question

I need to produce the intersection between some sorted arrays of integers in C. I know how to find the intersection between two sorted arrays, but I need to do this for more than two arrays, efficiently and without prior knowledge of the number of arrays. I can impose a sensible limit on the maximum number: let's say ten for now. These arrays could be anywhere from a few items to a couple of hundred thousand items long, and are by no means necessarily the same length.

Pseudo-code for producing the intersection of two sorted arrays:

while i < m and j < n do:
    if array1[i] < array2[j]:
        increment i
    else if array1[i] > array2[j]:
        increment j
    else:
        add array1[i] to intersection(array1, array2)
        increment i
        increment j
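
For reference, the pseudo-code above might be written in C roughly as follows. This is only a sketch: the function name intersect2 and the caller-supplied output buffer are assumptions made for illustration, not something given in the question.

```c
#include <stddef.h>

/* Hypothetical helper: writes the values common to two ascending arrays,
 * a (length m) and b (length n), into out and returns how many were written.
 * out must have room for at least min(m, n) ints. */
size_t intersect2(const int *a, size_t m, const int *b, size_t n, int *out)
{
    size_t i = 0, j = 0, k = 0;

    while (i < m && j < n) {
        if (a[i] < b[j]) {
            i++;                /* a is behind: advance it */
        } else if (a[i] > b[j]) {
            j++;                /* b is behind: advance it */
        } else {
            out[k++] = a[i];    /* equal: the value is in both arrays */
            i++;
            j++;
        }
    }
    return k;
}
```

Each element is visited at most once, so the running time is linear in m + n. If a value is repeated in both inputs, it is written once per matching pair.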

I am working in C, and I am after a clear explanation rather than code.

Recommended answer

I assume that all your arrays are sorted. Let's say we have arrays A_1 to A_n. Keep a counter for each array (so there are n counters i_1 to i_n, just as you did for two arrays).

Now introduce a min-heap that holds the arrays themselves, ordered by the number each array's counter currently points to. The array at the top of the heap is therefore the one whose current number is lowest, so at any moment we can retrieve the array with the lowest current number.

Now extract the minimum array from the heap and remember it. Keep extracting the minimum array as long as the number pointed to stays the same. If we extract all the arrays (i.e. every array currently points to the same number), we know that this number is in the intersection. If not (i.e. not every array currently points to that number), we know that the number we are examining cannot be in the intersection. In either case, increment the counters of all the arrays already extracted and put them back into the heap.

Repeat this until one array's counter reaches the end of its array. I'm sorry for the brief description, but I do not have enough time to work it out in more detail.
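
Since the description above is brief, here is one way it could be sketched in C. It is an illustration under stated assumptions, not a definitive implementation: struct cursor, intersect_n, the hand-rolled heap helpers and the MAX_ARRAYS cap (taken from the limit of ten mentioned in the question) are all names invented for this example.

```c
#include <stddef.h>

#define MAX_ARRAYS 10   /* assumed cap, taken from the question */

/* One cursor per input array: the data, its length, and the counter i_k. */
struct cursor {
    const int *data;
    size_t len;
    size_t pos;
};

static int cur_val(const struct cursor *c) { return c->data[c->pos]; }

/* Restore the min-heap property downwards from slot i. */
static void sift_down(struct cursor **h, size_t n, size_t i)
{
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, m = i;
        if (l < n && cur_val(h[l]) < cur_val(h[m])) m = l;
        if (r < n && cur_val(h[r]) < cur_val(h[m])) m = r;
        if (m == i) return;
        struct cursor *t = h[i]; h[i] = h[m]; h[m] = t;
        i = m;
    }
}

/* Restore the min-heap property upwards from slot i. */
static void sift_up(struct cursor **h, size_t i)
{
    while (i > 0) {
        size_t p = (i - 1) / 2;
        if (cur_val(h[p]) <= cur_val(h[i])) return;
        struct cursor *t = h[i]; h[i] = h[p]; h[p] = t;
        i = p;
    }
}

/* Intersects n ascending arrays (1 <= n <= MAX_ARRAYS). out needs room for
 * as many ints as the shortest array holds. Returns the number of values
 * written, or (size_t)-1 if n is out of range. */
size_t intersect_n(struct cursor *arr, size_t n, int *out)
{
    struct cursor *heap[MAX_ARRAYS], *taken[MAX_ARRAYS];
    size_t k = 0;

    if (n == 0 || n > MAX_ARRAYS) return (size_t)-1;
    for (size_t i = 0; i < n; i++) {
        if (arr[i].len == 0) return 0;   /* an empty array empties the result */
        arr[i].pos = 0;
        heap[i] = &arr[i];
    }
    for (size_t i = n / 2; i-- > 0; ) sift_down(heap, n, i);   /* heapify */

    for (;;) {
        size_t size = n, ntaken = 0;
        int v = cur_val(heap[0]);

        /* Extract every array whose current value equals the minimum. */
        while (size > 0 && cur_val(heap[0]) == v) {
            taken[ntaken++] = heap[0];
            heap[0] = heap[--size];
            sift_down(heap, size, 0);
        }
        if (ntaken == n) out[k++] = v;   /* all arrays agree on v */

        /* Advance the extracted cursors and push them back. */
        for (size_t i = 0; i < ntaken; i++) {
            if (++taken[i]->pos == taken[i]->len) return k;  /* one array ended */
            heap[size] = taken[i];
            sift_up(heap, size++);
        }
    }
}
```

Every element is pushed and popped at most once, so with N elements in total across the n arrays the work is on the order of N log n. As written, a value that is repeated in every array is reported once per full match.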

If one of the arrays has very few elements, it might be more useful simply to binary-search the other arrays for those numbers, or to check them using a hash table.
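
The binary-search variant could look something like the sketch below. The helper names contains and intersect_by_search, and the way the other arrays are passed in (an array of pointers plus a parallel array of lengths), are assumptions made for the example.

```c
#include <stddef.h>

/* Returns 1 if key occurs in the ascending array a of length n, else 0. */
static int contains(const int *a, size_t n, int key)
{
    size_t lo = 0, hi = n;          /* search the half-open range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < key)
            lo = mid + 1;
        else if (a[mid] > key)
            hi = mid;
        else
            return 1;
    }
    return 0;
}

/* Keeps each value of the short array that appears in every one of the
 * `count` other arrays. out needs room for short_len ints. */
size_t intersect_by_search(const int *shortest, size_t short_len,
                           const int *const *others, const size_t *lens,
                           size_t count, int *out)
{
    size_t k = 0;

    for (size_t i = 0; i < short_len; i++) {
        size_t j = 0;
        /* Stop at the first array that lacks the value. */
        while (j < count && contains(others[j], lens[j], shortest[i]))
            j++;
        if (j == count)
            out[k++] = shortest[i];
    }
    return k;
}
```

With a short array of s elements and other arrays of length up to L, this costs roughly s * count * log L comparisons, which can beat a full linear pass when s is small. A value repeated in the short array is written once per occurrence.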
