Are strings cached?
Question
>>> a = "zzzzqqqqasdfasdf1234"
>>> b = "zzzzqqqqasdfasdf1234"
>>> id(a)
4402117560
>>> id(b)
4402117560
but
>>> c = "!@#$"
>>> d = "!@#$"
>>> id(c) == id(d)
False
>>> id(a) == id(b)
True
Why do I get the same id() result for only some of the strings I assign?
Edited: I replaced "ascii string" with just "string". Thanks for the feedback.
It's not about ASCII vs. non-ASCII (your "non-ASCII" string is still ASCII; it's just punctuation rather than alphanumeric). CPython, as an implementation detail, interns string constants that contain only "name characters". "Name characters" in this case means the same thing as the regex escape \w: alphanumerics plus underscore.
Note: This can change at any time and should never be relied on; it's just an optimization CPython happens to use.
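The "name characters" rule described above can be approximated with a small check. This is only a sketch: `looks_like_name` is a hypothetical helper, not CPython's own code, and it assumes the rule is limited to ASCII letters, digits, and underscore (hence the `re.ASCII` flag).

```python
import re

# Hypothetical helper, not CPython's actual implementation: approximates the
# "name characters" rule, assuming it covers ASCII letters, digits, underscore.
_NAME_CHARS = re.compile(r"\w*", re.ASCII)


def looks_like_name(s: str) -> bool:
    """Return True if every character of s is an ASCII letter, digit, or underscore."""
    return _NAME_CHARS.fullmatch(s) is not None


print(looks_like_name("zzzzqqqqasdfasdf1234"))  # True
print(looks_like_name("!@#$"))                  # False
```

The two strings from the question land on opposite sides of this check, which matches the id() results shown there.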
At a guess, this choice was made to optimize code that uses getattr and setattr, dicts keyed by a handful of string literals, etc., where interning means that the dictionary lookups involved often end up doing pointer comparisons and avoiding comparing the strings at all (when two strings are both interned, they are by definition either the same object or not equal, so you can avoid reading their data entirely).
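If identity is actually needed for strings that are not automatically interned, `sys.intern` makes it explicit instead of relying on the implementation detail. The sketch below builds the strings at run time so they are not compiled constants. `same_key` is a hypothetical function that only illustrates the "identity first, equality as fallback" shortcut; it is not how dict lookup is literally written.

```python
import sys

# Build the strings at run time so they are not compiled string constants.
parts = ["!@", "#$"]
c = "".join(parts)
d = "".join(parts)

print(c == d)  # True: same contents

# sys.intern returns the canonical copy from the interpreter's intern table.
ci = sys.intern(c)
di = sys.intern(d)
print(ci is di)  # True: both names refer to the same interned object


def same_key(x: str, y: str) -> bool:
    """Hypothetical illustration: cheap identity check first, equality as fallback."""
    return x is y or x == y


print(same_key(ci, di))  # True, decided by the identity check alone
```

In ordinary code, compare strings with == and treat identity as an optimization only.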