Find max/min occurrence in integer array


Problem description

I just finished writing an algorithm that finds the values with the max/min number of occurrences in an input integer array. My idea is to sort the array (so that all occurrences of a value are contiguous) and use a <value:occurrences> pair to store, for every value, the corresponding number of occurrences.

It should have O(n log n) complexity, but I think there are some constant multipliers. What can I do to improve performance?

#include <stdio.h>
#include <stdlib.h>
#include "e7_8.h"

#define N 20
/*Structure for <value, frequencies_count> pair*/
typedef struct {
    int value;
    int freq;
} VAL_FREQ;


void  get_freq(int *v, int n, int *most_freq, int *less_freq) {

    int v_i, vf_i, current_value, current_freq;

    VAL_FREQ* sp = malloc(n*sizeof(VAL_FREQ));
    if(sp == NULL) exit(EXIT_FAILURE);

    mergesort(v,n);

    vf_i = 0;
    current_value = v[0];
    current_freq = 1;
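    /* Stop at n: reading v[n] would be out of bounds */
    for(v_i=1; v_i<n; v_i++) {
        if(v[v_i] == current_value) current_freq++;
        else{
            sp[vf_i].value = current_value;
            sp[vf_i++].freq = current_freq;
            current_value = v[v_i];
            current_freq = 1;
        }
    }
    /* Store the final run, which the loop above never writes */
    sp[vf_i].value = current_value;
    sp[vf_i++].freq = current_freq;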
    /*Finding max,min frequency*/
    int i, max_freq_val, max_freq, min_freq_val, min_freq;

    max_freq = sp[0].freq;
    max_freq_val = sp[0].value;
    min_freq = sp[0].freq;
    min_freq_val = sp[0].value;
    for(i=1; i<vf_i; i++) {
        if(sp[i].freq > max_freq) {
            max_freq = sp[i].freq;
            max_freq_val = sp[i].value;
        }
        if(sp[i].freq < min_freq) {
            min_freq = sp[i].freq;
            min_freq_val = sp[i].value;
        }
    }

    *most_freq = max_freq_val;
    *less_freq = min_freq_val;

    free(sp);
}

Solution

Let's start from the fact that your algorithm is already O(n*log(n)): every step is O(n) apart from the sorting, which is O(n*log(n)). Whether it can be significantly improved depends on the kind of input you expect. Edit: if having the values sorted at the end of the process (by value, in any case, not by number of occurrences) is not part of the requirement, and that appears to be the case, do not miss Oli Charlesworth's answer.

There are two concepts in play: the first is how many samples you are going to get (n); the second is how concentrated their values are, that is, how narrow or wide the range over which these values can be distributed is (w = MAX_VALUE - MIN_VALUE).

If n is smaller than w (so your values are sparse), then your approach is already near-optimal and leaves little room for improvement.
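Even in the sparse case, some of the constant factor can probably be trimmed by dropping the auxiliary array of pairs and tracking the best and worst run during a single scan of the sorted data. The following is only a sketch under assumptions: the name `get_freq_sorted` is hypothetical, the standard `qsort` stands in for the `mergesort` from `e7_8.h`, and ties are resolved in favour of the smaller value.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison callback for qsort: ascending order, written to avoid overflow. */
static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical name. Sorts v in place, then scans the runs of equal values,
   keeping only the longest and shortest run seen so far (no auxiliary array). */
void get_freq_sorted(int *v, int n, int *most_freq, int *less_freq) {
    assert(n > 0);
    qsort(v, (size_t)n, sizeof *v, cmp_int);

    int max_val = v[0], min_val = v[0];
    long long max_freq = 0;                  /* sentinel below any real count */
    long long min_freq = (long long)n + 1;   /* sentinel above any real count */
    int run_start = 0;

    for (int i = 1; i <= n; i++) {
        /* A run ends at the end of the array or when the value changes. */
        if (i == n || v[i] != v[run_start]) {
            long long freq = i - run_start;
            if (freq > max_freq) { max_freq = freq; max_val = v[run_start]; }
            if (freq < min_freq) { min_freq = freq; min_val = v[run_start]; }
            run_start = i;
        }
    }
    *most_freq = max_val;
    *less_freq = min_val;
}
```

The complexity is still dominated by the O(n*log(n)) sort; what goes away is the `malloc` and the second loop over the pairs.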

But if w is small and n is big, you have much to gain with the following method.

Let's say you know you cannot get any value less than MIN_VALUE, and no value greater than MAX_VALUE. Then you can use the value itself as an index into an array where you collect your frequencies. In this way, you skip the sorting step (O(n*log(n))), and you compute your frequencies in O(n).

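int buffer_frequencies[MAX_VALUE - MIN_VALUE + 1];

//Reset all counters to zero (memset is declared in <string.h>)
memset(buffer_frequencies, 0, sizeof buffer_frequencies);

int* value_frequencies = buffer_frequencies;
value_frequencies -= MIN_VALUE; //Shift the beginning of the array, so that
                                //you can use the value directly as the array index
//Negative indexes are fine as long as they still land inside buffer_frequencies.
//Strictly, the C standard only defines the shift while the pointer stays inside
//the array; buffer_frequencies[v[v_i] - MIN_VALUE] is the portable spelling.
for(v_i=0; v_i < n; v_i++) {
  value_frequencies[v[v_i]]++;
}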

Or even (a possibly slightly faster version of the for loop, although a good compiler will usually already convert it into the most efficient form):

int* p_v = v;
int* end_p_v = v+n;
for(; p_v < end_p_v; p_v++) {
  value_frequencies[*p_v]++;
}

Be careful: this method (both versions) is very sensitive to the input values, i.e. you will write outside the bounds of the array if you get a value below MIN_VALUE or above MAX_VALUE.
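If the input cannot be trusted to stay within the range, a range check in the counting loop removes that risk at the cost of one comparison pair per element. This is a minimal sketch, not part of the method above: `count_in_range` is a hypothetical helper, the bounds are illustrative values, and the table is indexed by `value - MIN_VALUE` instead of through a shifted pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bounds, for illustration only. */
enum { MIN_VALUE = -5, MAX_VALUE = 5 };

/* Hypothetical helper: counts each in-range value of v into
   freq[value - MIN_VALUE] and returns how many values were rejected.
   freq must have MAX_VALUE - MIN_VALUE + 1 elements, zeroed by the caller. */
size_t count_in_range(const int *v, size_t n, int *freq) {
    assert(freq != NULL);
    size_t rejected = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] < MIN_VALUE || v[i] > MAX_VALUE) {
            rejected++;        /* this value would have indexed outside freq */
            continue;
        }
        freq[v[i] - MIN_VALUE]++;
    }
    return rejected;
}
```

A non-zero return value tells the caller that the assumed bounds do not hold for this input, so the frequencies are incomplete.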

Then the second part of the algorithm:

//The first iteration could be optimized away, but it has no impact
int i = MIN_VALUE;
max_freq = value_frequencies[i];
max_freq_val = i;
min_freq = value_frequencies[i];
min_freq_val = i;
for(; i <= MAX_VALUE; i++) { //<= so that MAX_VALUE itself is examined
    max_freq_val = (value_frequencies[i] > max_freq) ? i : max_freq_val;
    max_freq = (value_frequencies[i] > max_freq) ? value_frequencies[i] : max_freq;
    min_freq_val = (value_frequencies[i] < min_freq) ? i : min_freq_val;
    min_freq = (value_frequencies[i] < min_freq) ? value_frequencies[i] : min_freq;
}
//Note: a value in the range that never occurs has frequency 0 and wins the minimum
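Putting the two parts together, the whole counting approach might look like the sketch below. The name `get_freq_counting` and its signature are assumptions made for illustration (the bounds are passed as parameters instead of macros), the table is heap-allocated with `calloc` so it starts zeroed, and values that never occur are skipped when looking for the minimum, which is a deliberate change from the fragments above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical name and signature. Assumes n > 0 and that every v[i] lies in
   [min_value, max_value]. Returns 0 on success, -1 if allocation fails. */
int get_freq_counting(const int *v, int n, int min_value, int max_value,
                      int *most_freq, int *less_freq) {
    assert(n > 0 && min_value <= max_value);

    /* One counter per possible value; calloc zero-initialises the table. */
    size_t width = (size_t)((long long)max_value - min_value) + 1;
    int *freq = calloc(width, sizeof *freq);
    if (freq == NULL) return -1;

    /* First part: O(n) counting, indexing by value - min_value. */
    for (int i = 0; i < n; i++) {
        assert(v[i] >= min_value && v[i] <= max_value);
        freq[(long long)v[i] - min_value]++;
    }

    /* Second part: O(w) scan of the table for the extreme counts. */
    long long max_freq = 0;                  /* sentinel below any real count */
    long long min_freq = (long long)n + 1;   /* sentinel above any real count */
    for (size_t k = 0; k < width; k++) {
        if (freq[k] == 0) continue;          /* value never occurs: skip it */
        int value = (int)((long long)min_value + (long long)k);
        if (freq[k] > max_freq) { max_freq = freq[k]; *most_freq = value; }
        if (freq[k] < min_freq) { min_freq = freq[k]; *less_freq = value; }
    }

    free(freq);
    return 0;
}
```

The total cost is O(n + w) time and O(w) memory, which is why this only pays off when w is small relative to n.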
