Counting unique elements in a large array
Question
One of my colleagues was asked the following question in an interview.
Given a huge array that stores unsigned ints, with a length of 100000000, find an efficient way to count how many times each distinct value occurs in the array.
E.g. arr = {2,34,5,6,7,2,2,5,1,34,5}
O/p: Count of 2 is 3, count of 34 is 2, and so on.
What is an efficient algorithm for this? At first I thought a dictionary/hash would be one option, but since the array is very large it seems inefficient. Is there any way to do this?
Thanks, chota
Recommended answer
Heap sort is O(n log n) and in-place. Sorting in place matters with large data sets because it needs no second copy of the array. Once the array is sorted, you can make a single pass through it, tallying the occurrences of each value. Because the array is sorted, equal values are adjacent, so as soon as the value changes you know you have seen every occurrence of the previous value.
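The approach above can be sketched roughly as follows. This is only a minimal illustration in Python, not code from the original answer: the function names heap_sort and count_sorted_runs are made up for this example, and a real solution for 100000000 unsigned ints would more likely use a compact array in a lower-level language.

```python
def heap_sort(a):
    """Sort `a` in place with heap sort: O(n log n) time, O(1) extra space."""
    n = len(a)

    def sift_down(root, end):
        # Push a[root] down until the max-heap property holds within a[:end].
        while True:
            child = 2 * root + 1
            if child >= end:
                return
            # Pick the larger of the two children.
            if child + 1 < end and a[child] < a[child + 1]:
                child += 1
            if a[root] >= a[child]:
                return
            a[root], a[child] = a[child], a[root]
            root = child

    # Build a max-heap bottom-up.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(i, n)
    # Repeatedly move the current maximum to the end of the unsorted region.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)


def count_sorted_runs(a):
    """Yield (value, count) pairs from an already sorted sequence in one pass."""
    if len(a) == 0:
        return
    current, count = a[0], 1
    for i in range(1, len(a)):
        if a[i] == current:
            count += 1
        else:
            # The value changed, so every occurrence of `current` has been seen.
            yield current, count
            current, count = a[i], 1
    yield current, count


arr = [2, 34, 5, 6, 7, 2, 2, 5, 1, 34, 5]
heap_sort(arr)
for value, count in count_sorted_runs(arr):
    print(f"Count of {value} is {count}")
```

For the sample array this reports a count of 3 for 2 and a count of 2 for 34, matching the expected output in the question. The same two functions should also work on an array.array('I', ...) from the standard library, which stores the values far more compactly than a Python list.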