Complexity in using Binary search and Trie


Problem description



Given a large list of alphabetically sorted words in a file, I need to write a program that, given a word x, determines whether x is in the list. Preprocessing is OK, since I will be calling this function many times with different inputs.
Priorities: 1. speed, 2. memory.

I already know I can use the following (n is the number of words, m is the average word length):
1. A trie: time is O(log(n)), space (best case) is O(log(nm)), space (worst case) is O(nm).
2. Load the complete list into memory, then binary search: time is O(log(n)), space is O(n*m).

I'm not sure about the complexities for the trie; please correct me if they are wrong. Also, are there other good approaches?

Solution

A lookup takes O(m) time in the trie, and up to O(m log(n)) with binary search, because each of the O(log(n)) string comparisons may have to examine up to m characters. The space is asymptotically O(nm) for any reasonable method, which you can probably reduce in some cases using compression. The trie structure is, in theory, somewhat better on memory because words share common prefixes, but in practice it has devils hiding in the implementation details: the memory needed to store pointers and potentially bad cache access patterns.
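To make the two costs concrete, here is a minimal Python sketch of both lookups. The names `contains_sorted` and `Trie` are made up for illustration, and the dict-per-node trie is just one simple layout, not a memory-optimized one.

```python
from bisect import bisect_left


def contains_sorted(words, x):
    """Binary search in a sorted list.

    O(log n) comparisons, each of which may compare up to m characters.
    """
    i = bisect_left(words, x)
    return i < len(words) and words[i] == x


class Trie:
    """Dict-based trie: a lookup visits one node per character, so O(m)."""

    _END = object()  # sentinel key marking "a word ends at this node"

    def __init__(self, words=()):
        self.root = {}
        for w in words:
            self.add(w)

    def add(self, word):
        node = self.root
        for ch in word:
            # Create the child node on first use, then descend into it.
            node = node.setdefault(ch, {})
        node[Trie._END] = True

    def __contains__(self, word):
        node = self.root
        for ch in word:
            node = node.get(ch)
            if node is None:
                return False
        # A prefix of a stored word is not itself a member.
        return Trie._END in node


words = ["apple", "apply", "banana"]  # must already be sorted
trie = Trie(words)
print(contains_sorted(words, "apply"), "apply" in trie)
print(contains_sorted(words, "app"), "app" in trie)
```

Each trie node here is a full Python dict, which illustrates the pointer-overhead caveat: the per-node bookkeeping can easily outweigh the savings from shared prefixes.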

There are other options for implementing a set structure: a hash set and a tree set are easy choices in most languages. I'd go for the hash set, as it is efficient and simple.
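As a rough sketch of the hash set approach in Python, the built-in `set` can serve as the hash set. The helper names `build_word_set` and `load_word_set` and the one-word-per-line file layout are assumptions for illustration, not something the question specifies.

```python
def build_word_set(lines):
    """Preprocess once: strip whitespace and skip blank lines."""
    return {line.strip() for line in lines if line.strip()}


def load_word_set(path):
    """Read a word list from a file, assuming one word per line."""
    with open(path, encoding="utf-8") as f:
        return build_word_set(f)


# Membership tests hash the word (O(m)) and then take
# expected constant time for the table lookup.
word_set = build_word_set(["apple\n", "apply\n", "banana\n"])
print("apply" in word_set)
print("app" in word_set)
```

The sorted order of the input file is not needed here, so this also works if the list is later extended out of order.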
