Counting unique elements in a large array


Question

One of my colleagues was asked the following question in an interview.

Given a huge array of unsigned ints, 100,000,000 elements long, find an efficient way to count how many times each distinct value occurs in the array.

E.g. arr = {2,34,5,6,7,2,2,5,1,34,5}
Output: count of 2 is 3, count of 34 is 2, and so on.

What is an efficient algorithm for this? At first I thought a dictionary/hash would be one option, but since the array is very large it is inefficient. Is there any way to do this?

Thanks, chota

Recommended answer

Heap sort is O(n log n) and in-place. Sorting in place matters when dealing with large data sets, because it needs no extra memory beyond the array itself. Once the array is sorted, you can make one pass through it, tallying occurrences of each value. Because the array is sorted, once the value changes you know you've seen all occurrences of the previous value.
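
The approach above can be sketched in Python roughly as follows. This is only an illustration, not the answerer's code: the function names `heap_sort` and `count_runs` are hypothetical, and a plain Python list stands in for the array. For 100,000,000 unsigned ints a compact typed container such as `array.array('I')` would probably be a better fit, and the same functions should work on it unchanged.

```python
def heap_sort(a):
    """Sort the mutable sequence `a` in place, ascending, using heap sort."""
    n = len(a)

    def sift_down(root, end):
        # Push a[root] down until the max-heap property holds within a[:end].
        while True:
            child = 2 * root + 1
            if child >= end:
                return
            # Pick the larger of the two children.
            if child + 1 < end and a[child + 1] > a[child]:
                child += 1
            if a[root] >= a[child]:
                return
            a[root], a[child] = a[child], a[root]
            root = child

    # Build a max-heap bottom-up.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(i, n)

    # Repeatedly move the current maximum to the end of the shrinking heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)


def count_runs(sorted_a):
    """Yield (value, count) for each run of equal values in a sorted sequence."""
    if len(sorted_a) == 0:
        return
    current, count = sorted_a[0], 1
    for i in range(1, len(sorted_a)):
        if sorted_a[i] == current:
            count += 1
        else:
            # The value changed, so every occurrence of `current` has been seen.
            yield current, count
            current, count = sorted_a[i], 1
    yield current, count


arr = [2, 34, 5, 6, 7, 2, 2, 5, 1, 34, 5]
heap_sort(arr)
for value, count in count_runs(arr):
    print(f"Count of {value} is {count}")
```

The sort uses only a constant amount of extra memory, and the counting pass keeps just the current value and its running count, so nothing proportional to the array size is allocated.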
