Why must dictionary keys be immutable?

Question

Why is it necessary for dictionary keys to be immutable? I'm looking for a simple, clear reason why keys in Python dictionaries have that restriction.

Recommended answer

On my computer, there's a file /etc/dictionaries-common/words containing a large collection of English words:

>>> with open("/etc/dictionaries-common/words") as f:
...     words = [line.strip() for line in f]
... 
>>> "python" in words
True
>>> "BDFL" in words
False

Let's create a dictionary storing the lengths of all those words:

>>> word_lengths = {w: len(w) for w in words}
>>> word_lengths["parrot"]
6

And, just for kicks, we'll shuffle our original word list:

>>> from random import shuffle
>>> shuffle(words)
>>> words[:5]
["Willie's", 'Araceli', 'accessed', 'engagingly', 'hobnobs']

Mmm, hobnobs. Anyway ... now that we've messed around a bit with words, we've become a little bit paranoid (possibly for the same reason that we're craving hobnobs), and we want to check that all the words in our word_lengths dictionary are still in words after we mixed it all up:

>>> all(w in words for w in word_lengths)
True

Well, we got there, but on my machine that took over three minutes - enough time to eat a couple more delicious biscuits, at least. Thinking about it, it's obvious why: we've got ...

>>> len(words)
99171

... nearly a hundred thousand words to check, and for each one in the dictionary, Python has to search through our mixed-up list of words until it finds a match. It won't always have to check the whole list, but on average that's going to be fifty thousand words (or half the list) each time, for a total of roughly 50,000 × 100,000 = 5,000,000,000 comparisons. Five billion is a lot, even in this miraculous age of technology.
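To make that arithmetic concrete, here is a minimal sketch that counts comparisons on a much smaller list. The helper linear_search_steps and the made-up entries word0 ... word999 are illustrative assumptions, not part of the session above. It only mimics what a list membership test has to do, one item at a time.

```python
import random

def linear_search_steps(items, target):
    """Count how many items are examined before target is found.

    Hypothetical helper: it imitates the left-to-right scan that
    `target in some_list` performs on a list.
    """
    for steps, item in enumerate(items, start=1):
        if item == target:
            return steps
    return len(items)  # target absent: every item was examined

# A stand-in for the real word list: 1,000 distinct made-up "words".
words = [f"word{i}" for i in range(1000)]
random.shuffle(words)

# Look up every word once and add up the comparisons.
total = sum(linear_search_steps(words, w) for w in words)
average = total / len(words)
```

Whatever order the shuffle produces, total comes out at 500,500 and average at 500.5: about half the list per lookup, which is the same pattern that gives five billion for a hundred thousand words.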

Just to be absolutely sure (I'm not normally so paranoid; normally I just get sleepy), let's check the other way around, and make sure that everything in words is still in word_lengths:

>>> all(w in word_lengths for w in words)
True

Hey, what? This time it was, like, a tenth of a second! What gives? You're freaking me out, man ... and hey, where are my biscuits? I had them just now, I'm sure of it.

Unlike a list, which can be in any old order (so making sure that some item is in there means checking each item in turn until we find it), a dictionary is a bit more efficient. Probably less fun at parties, but hey, leave it in charge of the music and all is copacetic, y'know?

The secret of dictionaries' ruthless efficiency is that for each item, the dictionary calculates a hash (just an integer, really) of the key based on its content, and uses that to store the item at a specific location in memory. Then, when you go looking for the item, it calculates the hash of the key's content again, says to itself "okay, 'python', that hashes to 7036520087640895475 ... yeah, I know where I must have put that, then", and goes straight to the right memory location to find it. So this time, it only had to do about a hundred thousand checks rather than five billion.
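Here's a rough sketch of that idea, in case it helps. ToyDict and its methods are invented for illustration. It keeps a fixed number of small "bucket" lists and uses the key's hash to pick one, whereas CPython's real dict uses a different layout (open addressing in a single table that grows as needed). The point is only that the hash says where to look, so most of the items are never examined.

```python
class ToyDict:
    """A toy hash table, not how CPython's dict is actually laid out."""

    def __init__(self, n_buckets=8):
        # Each bucket is a small list of (key, value) pairs.
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # The hash of the key decides which bucket it belongs to.
        return self.buckets[hash(key) % len(self.buckets)]

    def set(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key):
        # Only one bucket is searched, however many items are stored.
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        raise KeyError(key)


lengths = ToyDict()
for w in ["python", "parrot", "hobnobs"]:
    lengths.set(w, len(w))

parrot_length = lengths.get("parrot")
```

Note that the actual hash of a string such as "python" differs from one Python process to the next unless PYTHONHASHSEED is fixed, so the number quoted above won't match what you see on your own machine.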

It's kinda like having all your CDs neatly alphabetised on shelves, rather than stacked randomly out of their cases on top of your speakers. Dictionaries know where it's at, I'm telling you.

But there's a price to pay for dictionaries' ability to keep it together. Remember when I said that the dictionary calculates a hash based on the item's content? Well, what happens if that content changes? For immutable objects that's not a problem - their content can't change - but mutable objects, by definition, can change their contents, and when they do, their hash (if they even have one) will change too. Which is cool, obviously; not everyone wants to be put in a box, I get that. But if the hash has changed, there's no way for the dictionary to work out where it put the thing.

It's as though Joy Division changed their name to New Order, and now you've got no idea where you put that 12" remix of Blue Monday. It's just not gonna work.
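You can watch this go wrong with a deliberately badly behaved key. The Band class below is a made-up example, not anything from the standard library: it is mutable, yet it defines a hash based on its current name, which is exactly what Python's built-in mutable types refuse to do.

```python
class Band:
    """Hypothetical mutable key whose hash depends on its content."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Band) and self.name == other.name

    def __hash__(self):
        # Hash follows the current name, so it changes when the name does.
        return hash(self.name)


band = Band("Joy Division")
shelf = {band: 'Blue Monday 12" remix'}

# The key mutates while it is sitting inside the dictionary.
band.name = "New Order"

# Looking under the new name fails: the entry was filed under the old hash.
found_by_new_name = Band("New Order") in shelf

# Looking under the old name fails too: the hash leads to the right spot,
# but the key stored there no longer compares equal to "Joy Division".
found_by_old_name = Band("Joy Division") in shelf

# The record is still on the shelf; we just can't find it by lookup.
still_stored = len(shelf)
```

Both lookups come back False while still_stored is 1: the entry is in there, but nothing you can ask for will lead you to it.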

So, dictionaries have a rule: if you want to be a key, don't go changing.
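In practice Python enforces the rule through hashing: the built-in mutable containers (list, dict, set) simply don't provide a hash, so a dictionary turns them away at the door. A small sketch, with variable names chosen purely for illustration:

```python
lengths = {}

# A tuple of strings is immutable and hashable, so it works as a key.
lengths[("monty", "python")] = 11

# A list is mutable and has no hash, so the dictionary rejects it.
try:
    lengths[["monty", "python"]] = 11
except TypeError as exc:
    list_error = str(exc)

# A tuple is only hashable if everything inside it is hashable too.
try:
    lengths[("monty", ["python"])] = 11
except TypeError as exc:
    nested_error = str(exc)
```

Both failures raise TypeError complaining about an unhashable type; the exact wording of the message varies between Python versions.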
