比较两个排序INT数组 [英] Comparing two sorted int arrays

查看:99
本文介绍了比较两个排序INT数组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有几百万固定大小(100),INT阵列。每个数组进行排序,并具有独特的元素。对于每个阵列,我想找到这有70%的通用元素所有阵列。现在,我得到大约100万的比较(使用Arrays.binarySearch())每秒,这对我们来说是太慢了。

I have millions of fixed-size (100) int arrays. Each array is sorted and has unique elements. For each array, I want to find all arrays which have 70% common elements. Right now I am getting around 1 million comparisons (using Arrays.binarySearch()) per second, which is too slow for us.

谁能推荐一个更好的搜索算法?

Can anyone recommend a better searching algorithm?

推荐答案

这样的事情应该做的工作(前提是数组排序,并含有独特的元素):

Something like this should do the job (provided that the arrays are sorted and contain unique elements):

public static boolean atLeastNMatchingElements(final int n,
    final int[] arr1,
    final int[] arr2){

    /* check assumptions */
    assert (arr1.length == arr2.length);

    final int arrLength = arr2.length;

    { /* optimization */
        final int maxOffset = Math.max(arrLength - n, 0);
        if(arr1[maxOffset] < arr2[0] || arr2[maxOffset] < arr1[0]){
            return false;
        }
    }

    int arr2Offset = 0;
    int matches = 0;

    /* declare variables only once, outside loop */
    int compResult; int candidate;

    for(int i = 0; i < arrLength; i++){
        candidate = arr1[i];
        while(arr2Offset < arrLength - 1){
            compResult = arr2[arr2Offset] - candidate;
            if(compResult > 0){
                break;
            } else{
                arr2Offset++;
                if(compResult == 0){
                    matches++;
                    break;
                }
            }
        }
        if(matches == n){
            return true;
        }
        /* optimization */
        else if(matches < n - (arrLength - arr2Offset)){
            return false;
        }
    }
    return false;
}

示例用法:

public static void main(final String[] args){
    final int[] arr1 = new int[100];
    final int[] arr2 = new int[100];
    int x = 0, y = 0;
    for(int i = 0; i < 100; i++){
        if(i % 10 == 0){
            x++;
        }
        if(i % 12 == 0){
            y++;
        }
        arr1[i] = x;
        arr2[i] = y;
        x++;
        y++;
    }
    System.out.println(atLeastNMatchingElements(70, arr1, arr2));
    System.out.println(atLeastNMatchingElements(95, arr1, arr2));
}

输出:

真正的结果
  假

true
false

我现在试图优化上述code。请检查code块是否标记为

Premature Optimizations™

I have now tried to optimize the above code. Please check whether the code blocks marked as

/* optimization */

做一个明显的区别。优化后,我会重构code把它降低到一个或两个return语句。

make a noticeable difference. After optimization, I would refactor the code to get it down to one or two return statements.

这篇关于比较两个排序INT数组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆