HashSet of Strings taking up too much memory, suggestions...?


Question

I am currently storing a list of words (around 120,000) in a HashSet, to use as a list to check entered words against and see whether they are spelt correctly, just returning yes or no.

I was wondering if there is a way to do this that takes up less memory. Currently the 120,000 words take around 12 MB, while the actual file the words are read from is around 900 KB.

Any suggestions?

Thanks in advance.

Recommended answer

Check out Bloom filters or cuckoo hashing.

I am not sure if this is the answer to your question, but these alternatives are worth looking into. Bloom filters are commonly used for spell-checker kinds of use cases.
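
To make the Bloom filter suggestion more concrete, here is a minimal sketch. It assumes Java (the question does not name the language), and the class name `WordBloomFilter`, the FNV-1a hash, and the sizing are illustrative choices, not anything from the original post. A Bloom filter stores only bits, never the strings themselves, so it can answer "definitely not in the list" or "probably in the list"; an occasional misspelled word may be reported as correct, but a word that was added is never reported as missing. With roughly 10 bits per word and 7 hash probes, 120,000 words would need about 150 KB and should give a false-positive rate of roughly 1%.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Illustrative Bloom filter for a word list; all names here are hypothetical.
class WordBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    WordBloomFilter(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // 32-bit FNV-1a over the UTF-8 bytes, with a seed mixed into the start value.
    private static int fnv1a(String s, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xFF);
            h *= 0x01000193;
        }
        return h;
    }

    // Double hashing: the i-th probe is h1 + i * h2, reduced modulo the bit count.
    private int index(String word, int i) {
        int h1 = fnv1a(word, 0);
        int h2 = fnv1a(word, 0x9E3779B9) | 1; // forced odd so the stride is never zero
        return Math.floorMod(h1 + i * h2, numBits);
    }

    void add(String word) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(word, i));
        }
    }

    // false means definitely absent; true means probably present.
    boolean mightContain(String word) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(word, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Sizing assumption: about 10 bits per word for 120,000 words.
        WordBloomFilter filter = new WordBloomFilter(1_200_000, 7);
        for (String w : new String[] {"apple", "banana", "cherry"}) {
            filter.add(w);
        }
        System.out.println("apple: " + filter.mightContain("apple"));
        System.out.println("aple: " + filter.mightContain("aple"));
    }
}
```

If false positives are not acceptable for your spell checker, this structure alone will not do; it trades exactness for memory.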
