Why do the following two duplicate-finder algorithms have different time complexities?


Question


I was reading this question. The selected answer contains the following two algorithms. I couldn't understand why the first one's time complexity is O(ln(n)). In the worst case, if the array doesn't contain any duplicates, it will loop n times, as does the second one. Am I wrong, or am I missing something? Thank you.

1) A faster (in the limit) way

Here's a hash based approach. You gotta pay for the autoboxing, but it's O(ln(n)) instead of O(n²). An enterprising soul would go find a primitive int-based hash set (Apache or Google Collections has such a thing, methinks.)

boolean duplicates(final int[] zipcodelist)
{
  Set<Integer> lump = new HashSet<Integer>();
  for (int i : zipcodelist)
  {
    if (lump.contains(i)) return true;
    lump.add(i);
  }
  return false;
}

2) Bow to HuyLe

See HuyLe's answer for a more or less O(n) solution, which I think needs a couple of add'l steps:

static boolean duplicates(final int[] zipcodelist) {
    final int MAXZIP = 99999;
    boolean[] bitmap = new boolean[MAXZIP + 1];
    java.util.Arrays.fill(bitmap, false); // redundant: Java zero-initializes boolean[] to false

    for (int item : zipcodelist)
        if (!bitmap[item]) bitmap[item] = true;
        else return true;

    return false;
}

Solution

The first solution should have expected complexity of O(n), since the whole zip code list must be traversed, and processing each zip code is O(1) expected time complexity.

Even taking into consideration that insertion into a HashMap may trigger a rehash, the complexity is still amortized O(1). This is a bit of a non sequitur, since there may be no relation between Java's HashMap and the assumption in the link, but it is there to show that it is possible.
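As an illustration of why rehashing doesn't affect the bound, here is a minimal sketch of the first solution that presizes the HashSet so no rehash can occur (capacity chosen so that n elements stay under the default 0.75 load factor). The class name and sizing formula are my own, not from the original answer; asymptotically, the presized and default-sized versions are the same, because rehashing is amortized O(1) per insertion.

```java
import java.util.HashSet;
import java.util.Set;

public class PresizedCheck {
    static boolean duplicates(final int[] zipcodelist) {
        // Presize so capacity * 0.75 >= n, i.e. no rehash ever happens.
        Set<Integer> lump = new HashSet<>((int) (zipcodelist.length / 0.75f) + 1);
        for (int i : zipcodelist) {
            // add() returns false when the element is already present.
            if (!lump.add(i)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(duplicates(new int[]{10001, 60601, 10001})); // prints "true"
        System.out.println(duplicates(new int[]{10001, 60601, 94103})); // prints "false"
    }
}
```

Using the boolean return value of `add()` also saves the separate `contains()` lookup in the original loop, though that only changes the constant factor, not the O(n) expected complexity.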

From HashSet documentation:

This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets.

It's the same for the second solution, which is correctly analyzed: O(n).

(Just an off-topic note: BitSet is faster than a boolean array, as seen in the original post, since 8 booleans are packed into 1 byte, which uses less memory.)
