Are strings cached?

Question

>>> a = "zzzzqqqqasdfasdf1234"
>>> b = "zzzzqqqqasdfasdf1234"
>>> id(a)
4402117560
>>> id(b)
4402117560

but

>>> c = "!@#$"
>>> d = "!@#$"
>>> id(c) == id(d)
False
>>> id(a) == id(b)
True

Why do I get the same id() result for only some of the strings I assign?

Edited: I replaced "ascii string" with just "string". Thanks for the feedback.

Answer

It's not about ASCII vs. non-ASCII (your "non-ASCII" is still ASCII, it's just punctuation, not alphanumeric). CPython, as an implementation detail, interns string constants that contain only "name characters". "Name characters" in this case means the same thing as the regex escape \w in its ASCII sense: alphanumerics plus underscore.

Note: This can change at any time and should never be relied on; it's just an optimization CPython happens to use.
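
Since automatic interning can't be relied on, here is a minimal sketch of the two approaches that are safe: compare contents with ==, or intern explicitly with sys.intern when shared identity is really wanted. The strings are built at runtime (an assumption made for the demo) so the compiler cannot merge them into a single constant.

```python
import sys

# Build both strings at runtime so they are not folded into one shared constant.
parts = ["!@", "#$"]
c = "".join(parts)
d = "".join(parts)

# Equality compares contents, so it does not depend on interning at all.
contents_equal = (c == d)

# sys.intern returns the one canonical object for a given string value,
# so both calls hand back the same object.
same_object = sys.intern(c) is sys.intern(d)

print(contents_equal, same_object)
```

Whether `c is d` holds before the explicit sys.intern calls is exactly the implementation detail described above, so the sketch deliberately doesn't test it.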

At a guess, this choice was made to optimize code that uses getattr and setattr, dicts keyed by a handful of string literals, etc., where interning means that the dictionary lookups involved often end up doing pointer comparisons and avoid comparing the strings at all (when two strings are both interned, they are by definition either the same object or not equal, so you can avoid reading their data entirely).
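
The identity shortcut can be observed from Python. The sketch below uses a hypothetical key class, CountingKey (not part of any library), that counts how often its __eq__ runs. It is a stand-in for strings, since str comparisons can't be instrumented directly, and it relies on the dict lookup checking identity before equality, as CPython does.

```python
class CountingKey:
    """Hypothetical key type that records how many times __eq__ is called."""

    eq_calls = 0

    def __init__(self, text):
        self.text = text

    def __hash__(self):
        return hash(self.text)

    def __eq__(self, other):
        CountingKey.eq_calls += 1
        return isinstance(other, CountingKey) and self.text == other.text


key = CountingKey("name")
table = {key: 1}

# Lookup with the very same object: the identity check succeeds first.
table[key]
same_object_calls = CountingKey.eq_calls

# Lookup with an equal but distinct object: the dict falls back to __eq__.
table[CountingKey("name")]
distinct_object_calls = CountingKey.eq_calls - same_object_calls

print(same_object_calls, distinct_object_calls)
```

When the key is the same object, as interned strings are, the content comparison is skipped. An equal but distinct key has to pay for it.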
