空间有效的数据结构存储Word列表? [英] Space-Efficient Data Structure for Storing a Word List?
问题描述
- 存储约100k个英文单词的列表
- 需要使用最小记忆
- 查找需要合理,但不必快速闪电
我正在使用Java,所以我的第一个尝试只是使用Set< String> ;.不过,我的目标是移动设备,而且内存已经很低。由于很多英文单词都有共同的前缀,所以一个特殊效果似乎是一个体面的赌注,以节省一些记忆 - 任何人都知道一些其他好的选择。
编辑 - 更多信息 - 数据结构将用于两个操作
- 回答:列表中有一些单词XYZ
- 生成一个字母不同的XYZ附近的单词
感谢您的好建议
你在做什么?如果是拼写检查,您可以使用一个绽放过滤器 - 请参阅此代码kata 。
Is there anything better than a Trie for this situation?
- Storing a list of ~100k English words
- Needs to use minimal memory
- Lookups need to be reasonable, but don't have to be lightning fast
I'm working with Java, so my first attempt was to just use a Set<String>. However, I'm targeting a mobile device and already running low on memory. Since many English words share common prefixes, a trie seems like a decent bet to save some memory -- anyone know some other good options?
EDIT - More info - The data structure will be used for two operations
- Answering: Is some word XYZ in the list?
- Generating the neighborhood of words around XYZ with one letter different
Thanks for the good suggestions
What are you doing? If it's spell checking, you could use a bloom filter - see this code kata.
这篇关于空间有效的数据结构存储Word列表?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!