Getting the Top Four Maximum Values from a Java Array

Problem description

I am trying to find the top 4 maximum values in an integer array. For example, the input array {1232, -1221, 0, 345, 78, 99} should return {1232, 345, 99, 78}. I have solved the requirement with the method below, but I am still not satisfied with its time efficiency. Is there any way to optimize the method further as the input becomes larger? Any clues are really appreciated. Thank you.

public int[] findTopFourMax(int[] input) {
    int[] topFourList = { Integer.MIN_VALUE, Integer.MIN_VALUE, Integer.MIN_VALUE, Integer.MIN_VALUE };
    for (int current : input) {
        if (current > topFourList[0]) {
            topFourList[3] = topFourList[2];
            topFourList[2] = topFourList[1];
            topFourList[1] = topFourList[0];
            topFourList[0] = current;
        } else if (current > topFourList[1]) {
            topFourList[3] = topFourList[2];
            topFourList[2] = topFourList[1];
            topFourList[1] = current;
        } else if (current > topFourList[2]) {
            topFourList[3] = topFourList[2];
            topFourList[2] = current;
        } else if (current > topFourList[3]) {
            topFourList[3] = current;
        }
    }
    return topFourList;
}

Solution

The simplest (though not the most efficient) way is to sort the array and take the subarray containing the last 4 elements.

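You can use Arrays.sort() (http://docs.oracle.com/javase/6/docs/api/java/util/Arrays.html#sort%28int%5B%5D%29) to sort and Arrays.copyOfRange() (http://docs.oracle.com/javase/6/docs/api/java/util/Arrays.html#copyOfRange%28int%5B%5D,%20int,%20int%29) to take the subarray.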

int[] arr = new int[] {1232, -1221, 0, 345, 78, 99};
Arrays.sort(arr);
int[] top4 = Arrays.copyOfRange(arr, arr.length-4,arr.length);
System.out.println(Arrays.toString(top4));
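
Note that Arrays.sort() orders the array ascending, so the snippet above prints [78, 99, 345, 1232], while the question asks for the values largest first. A minimal sketch of one way to get that order follows. The class and method names (TopFourSorted, topKDescending) are made up for illustration, and the sketch assumes 0 <= k <= input.length.

```java
import java.util.Arrays;

class TopFourSorted {
    // Hypothetical helper: returns the k largest values, largest first.
    // Assumes 0 <= k <= input.length. Sorts a copy, so the caller's array is untouched.
    static int[] topKDescending(int[] input, int k) {
        int[] sorted = Arrays.copyOf(input, input.length);
        Arrays.sort(sorted); // ascending order
        int[] top = new int[k];
        for (int i = 0; i < k; i++) {
            // Walk backwards from the end of the sorted copy.
            top[i] = sorted[sorted.length - 1 - i];
        }
        return top;
    }

    public static void main(String[] args) {
        int[] arr = {1232, -1221, 0, 345, 78, 99};
        System.out.println(Arrays.toString(topKDescending(arr, 4))); // [1232, 345, 99, 78]
    }
}
```

The cost is still dominated by the O(n log n) sort; the reversal only touches k elements.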


For a more efficient solution, one can maintain a min-heap of the top K elements or use a selection algorithm to find the 4th largest element. The two approaches are described in this thread.

Though the selection algorithm offers an O(n) solution, the min-heap solution (which is O(n log k)) should have better constants, and especially for small k it is likely to be faster.
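
For comparison, the selection-based route can be sketched with quickselect, which is O(n) on average and O(n^2) in the worst case (the guaranteed-linear median-of-medians variant is more involved). This is only an illustrative sketch, not the code from the linked thread: the names TopKSelect and topK are hypothetical, it assumes 1 <= k <= arr.length, and it works on a copy of the input. The k values come back in no particular order.

```java
import java.util.Arrays;
import java.util.Random;

class TopKSelect {
    private static final Random RNG = new Random();

    // Hypothetical helper: returns the k largest values of arr, in unspecified order.
    static int[] topK(int[] arr, int k) {
        if (k < 1 || k > arr.length) {
            throw new IllegalArgumentException("need 1 <= k <= arr.length");
        }
        int[] a = Arrays.copyOf(arr, arr.length);
        int lo = 0, hi = a.length - 1;
        int target = k - 1; // index of the k-th largest once a is partitioned in descending order
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (p == target) break;
            if (p < target) lo = p + 1; // k-th largest lies to the right of the pivot
            else hi = p - 1;            // k-th largest lies to the left of the pivot
        }
        // Everything in a[0..k-1] is now >= everything in a[k..].
        return Arrays.copyOf(a, k);
    }

    // Lomuto partition around a random pivot; values greater than the pivot move left.
    // Returns the pivot's final index.
    private static int partition(int[] a, int lo, int hi) {
        swap(a, lo + RNG.nextInt(hi - lo + 1), hi);
        int pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] > pivot) swap(a, i, store++);
        }
        swap(a, store, hi);
        return store;
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

Calling TopKSelect.topK(new int[] {1232, -1221, 0, 345, 78, 99}, 4) yields the values 1232, 345, 99 and 78 in some order; sort the result if a fixed order is needed.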

P.S. (EDIT):

For 4 elements, you might find that running a loop 4 times and finding the max on each pass (and changing the old max to -infinity after each pass) is more efficient than the more "complex" approaches, since it requires only sequential reads and has fairly small constants. This is of course not true for larger k: the approach costs O(nk), which decays into O(n^2) as k->n.


EDIT2: benchmarking:

For the fun of it, I ran a benchmark on the attached code. The results (total milliseconds over 200 runs on arrays of 5,000,000 random ints, with k = 4) are:

[naive, sort, heap] = [9032, 214902, 7531]

We can see that the naive and heap approaches are much better than the sort-based approach, and that the naive approach is slightly slower than the heap-based one. I did a Wilcoxon test to check whether the difference between naive and heap is statistically significant, and I got a p-value of 3.4573e-17. This means that, if the two approaches really performed identically, the probability of observing a difference at least this large would be about 3.4573e-17 (extremely small). From this we can conclude that the heap-based solution gives better performance than the naive and sorting solutions on this benchmark (and we showed it empirically!).

Attachment: The code I used:

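import java.util.Arrays;
import java.util.PriorityQueue;
import java.util.Random;

// The class name is a placeholder added so the listing compiles; the original post shows only the methods.
public class TopKBenchmark {

    // Makes k passes over the array, extracting the current maximum on each pass.
    // Overwrites arr. Assumes arr.length >= k; otherwise maxIdx stays -1 and the write below fails.
    public static int[] findTopKNaive(int[] arr, int k) {
        int[] res = new int[k];
        for (int j = 0; j < k; j++) {
            int max = Integer.MIN_VALUE, maxIdx = -1;
            for (int i = 0; i < arr.length; i++) {
                if (max < arr[i]) {
                    max = arr[i];
                    maxIdx = i;
                }
            }
            // Mark the extracted maximum as used so the next pass skips it.
            arr[maxIdx] = Integer.MIN_VALUE;
            // Fill from the back so the result ends up in ascending order.
            res[k - 1 - j] = max;
        }
        return res;
    }

    // Sorts the whole array (in place) and returns its last k elements, in ascending order.
    public static int[] findTopKSort(int[] arr, int k) {
        Arrays.sort(arr);
        return Arrays.copyOfRange(arr, arr.length - k, arr.length);
    }

    // Keeps a min-heap of the k largest values seen so far.
    // Assumes arr.length >= k; the result is in ascending order.
    public static int[] findTopKHeap(int[] arr, int k) {
        PriorityQueue<Integer> pq = new PriorityQueue<Integer>();
        for (int x : arr) {
            if (pq.size() < k) pq.add(x);
            else if (pq.peek() < x) {
                // x beats the smallest of the current top k, so it replaces it.
                pq.poll();
                pq.add(x);
            }
        }
        int[] res = new int[k];
        for (int i = 0; i < k; i++) res[i] = pq.poll();
        return res;
    }

    public static int[] createRandomArray(int n, Random r) {
        int[] arr = new int[n];
        for (int i = 0; i < n; i++) arr[i] = r.nextInt();
        return arr;
    }

    public static void main(String... args) throws Exception {
        Random r = new Random(1);
        int k = 4;
        int repeats = 200;
        int n = 5000000;
        // results[0] = naive, results[1] = sort, results[2] = heap (milliseconds per run)
        long[][] results = new long[3][repeats];
        for (int i = 0; i < repeats; i++) {
            int[] arr = createRandomArray(n, r);
            int[] myCopy;
            // Each method gets a fresh copy, since naive and sort modify their input.
            myCopy = Arrays.copyOf(arr, n);
            long start = System.currentTimeMillis();
            findTopKNaive(myCopy, k);
            results[0][i] = System.currentTimeMillis() - start;
            myCopy = Arrays.copyOf(arr, n);
            start = System.currentTimeMillis();
            findTopKSort(myCopy, k);
            results[1][i] = System.currentTimeMillis() - start;
            myCopy = Arrays.copyOf(arr, n);
            start = System.currentTimeMillis();
            findTopKHeap(myCopy, k);
            results[2][i] = System.currentTimeMillis() - start;
        }
        // Total time per method across all repeats.
        long[] sums = new long[3];
        for (int i = 0; i < repeats; i++)
            for (int j = 0; j < 3; j++)
                sums[j] += results[j][i];
        System.out.println(Arrays.toString(sums));

        // Paired per-run timings (naive, heap) used as input for the statistical test.
        System.out.println("results for statistic test:");
        for (int i = 0; i < repeats; i++) {
            System.out.println(results[0][i] + " " + results[2][i]);
        }
    }
}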
