don't need dictionary's keys - hash table?

Question

Hello,
I am using some very large dictionaries with keys that are long strings
(urls). For a large dictionary these keys start to take up a
significant amount of memory. I do not need access to these keys -- I
only need to be able to retrieve the value associated with a certain
key, so I do not want to have the keys stored in memory. Could I just
hash() the url strings first and use the resulting integer as the key?
I think what I'm after here is more like a traditional hash table. If I
do it this way am I going to get the memory savings I am after? Will
the hash function always generate unique keys? Also, would the same
technique work for a set?

Any other thoughts or considerations are appreciated.

Thank You.

Recommended answer

kdot...@gmail.com wrote:

I just realized that of course the hash is not always going to be
unique, so this wouldn't really work. And it seems a hash table would
still need to store the keys (as strings) so that string comparisons
can be done when a collision occurs. I guess there's no avoiding
storing the keys?


kd*****@gmail.com wrote:

Will the hash function always generate unique keys?



no. hash() is designed for dictionaries (hash tables), not for use as a
cryptographic hash.
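
One way to act on this distinction, offered only as a sketch and not as something proposed in the thread: key the dictionary on a fixed-size cryptographic digest of each URL instead of on the URL itself. The names `url_key` and `DigestDict` and the 16-byte digest size are assumptions made for illustration. Collisions become extremely unlikely but not impossible, and a collision would silently overwrite an entry, so this only fits applications that can accept that risk. Any memory saving depends on the URLs being much longer than the digest and should be measured.

```python
import hashlib


def url_key(url: str, size: int = 16) -> bytes:
    """Return a fixed-size digest of the URL to use as the dict key."""
    return hashlib.blake2b(url.encode("utf-8"), digest_size=size).digest()


class DigestDict:
    """Hypothetical mapping that stores a digest of each URL, not the URL."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, url, value):
        self._data[url_key(url)] = value

    def __getitem__(self, url):
        # Raises KeyError for unknown URLs, like a normal dict.
        return self._data[url_key(url)]

    def __contains__(self, url):
        return url_key(url) in self._data

    def __len__(self):
        return len(self._data)


table = DigestDict()
table["http://example.com/a/very/long/path?with=query"] = 1
table["http://example.com/another/long/path"] = 2
print(table["http://example.com/another/long/path"])
```

The original URLs cannot be recovered from such a table, which matches the stated requirement of only looking up values by key.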

depending on your application, a Bloom filter might be good enough:

http://en.wikipedia.org/wiki/Bloom_filter

(see the links section for a Python implementation)

</F>
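
For readers unfamiliar with the idea, here is a minimal Bloom filter sketch. It is not the implementation linked from the Wikipedia page; the class name, the default sizes, and the use of SHA-256 with double hashing are all assumptions chosen for illustration. A Bloom filter only answers membership questions, so it can stand in for a set but cannot retrieve values, and it can report false positives (never false negatives).

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: compact set membership with false positives."""

    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 7):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive all bit positions from two 64-bit slices of one digest
        # (double hashing), instead of computing num_hashes separate hashes.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step odd
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos >> 3] & (1 << (pos & 7))
            for pos in self._positions(item)
        )


seen = BloomFilter()
seen.add("http://example.com/page/1")
seen.add("http://example.com/page/2")
print("http://example.com/page/1" in seen)
```

The filter's size is fixed up front regardless of how long the URLs are; the false-positive rate rises as more items are added relative to `num_bits`.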


kd*****@gmail.com wrote:

I just realized that of course the hash is not always going to be
unique, so this wouldn't really work. And it seems a hash table would
still need to store the keys (as strings) so that string comparisons
can be done when a collision occurs.



btw, Python's dictionary type *is* a highly-optimized implementation of
a "traditional hash table".

</F>
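
The point about collisions can be seen directly with a small experiment. The `CollidingKey` class below is hypothetical and exists only to force every key onto the same hash value: the dictionary still keeps the entries apart because it stores the keys and compares them with `==`, whereas a table keyed on the bare hash value merges them.

```python
class CollidingKey:
    """Hypothetical key type whose instances all share one hash value."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # force every key into a collision

    def __eq__(self, other):
        return isinstance(other, CollidingKey) and self.name == other.name


# The dict resolves the collision by comparing the stored keys.
d = {CollidingKey("a"): 1, CollidingKey("b"): 2}
print(len(d), d[CollidingKey("a")], d[CollidingKey("b")])

# Keying on the hash value alone discards the information needed to
# tell the two entries apart, so the second write replaces the first.
by_hash = {}
by_hash[hash(CollidingKey("a"))] = 1
by_hash[hash(CollidingKey("b"))] = 2
print(len(by_hash), by_hash[42])
```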

