What is the difference between the len() and sys.getsizeof() functions in Python?

Question

When I run the code below, I get 3 and 36, respectively.

import sys

x = "abd"
print(len(x))
print(sys.getsizeof(x))

Can someone explain the difference between them?

Answer

They are not the same thing at all.

len() queries for the number of items contained in a container. For a string that's the number of characters:

Return the length (the number of items) of an object. The argument may be a sequence (string, tuple or list) or a mapping (dictionary).
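
As a quick illustration of the item counting described above, here is a minimal sketch; the variable names are made up for this example and are not from the question:

```python
# len() counts items; it says nothing about memory use.
text = "abd"
numbers = [10, 20, 30, 40]
mapping = {"a": 1, "b": 2}

print(len(text))     # number of characters in the string
print(len(numbers))  # number of elements in the list
print(len(mapping))  # number of key/value pairs in the dict
```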

sys.getsizeof() on the other hand returns the memory size of the object:

Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.

Python string objects are not simple sequences of characters, 1 byte per character.

Specifically, the sys.getsizeof() function includes the garbage collector overhead if any:

getsizeof() calls the object’s __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
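
The relationship between getsizeof() and __sizeof__() can be observed directly. This is only a sketch and assumes CPython; the size of the collector header is a build detail, so the example prints the difference instead of relying on a particular number:

```python
import gc
import sys

text = "abd"
items = [1, 2, 3]

# Strings are not tracked by the cyclic garbage collector, so on CPython
# getsizeof() adds nothing on top of __sizeof__().
print(gc.is_tracked(text), sys.getsizeof(text) - text.__sizeof__())

# Lists are tracked, so getsizeof() reports __sizeof__() plus whatever
# per-object collector overhead this build uses.
print(gc.is_tracked(items), sys.getsizeof(items) - items.__sizeof__())
```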

String objects do not need to be tracked (they cannot create circular references), but they do need more memory than just the bytes for their characters. In Python 2, the string type's __sizeof__ method is implemented as follows (in C):

Py_ssize_t res;
res = PyStringObject_SIZE + PyString_GET_SIZE(v) * Py_TYPE(v)->tp_itemsize;
return PyInt_FromSsize_t(res);

where PyStringObject_SIZE is the C struct header size for the type, PyString_GET_SIZE is basically the same as len(), and Py_TYPE(v)->tp_itemsize is the per-character size. In Python 2.7, for byte strings, the size per character is 1, so it is the fixed PyStringObject_SIZE overhead that is confusing you; on my Mac that size is 37 bytes:

>>> sys.getsizeof('')
37
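
The same header-plus-payload arithmetic can be checked from Python itself. The sketch below assumes CPython 3, where bytes plays the role of the Python 2 byte string; the header size differs between versions and platforms, so it is measured at run time instead of being hard-coded:

```python
import sys

header = sys.getsizeof(b"")  # fixed per-object overhead; platform dependent

for data in (b"", b"a", b"abd", b"x" * 100):
    payload = sys.getsizeof(data) - header
    # For bytes the item size is 1, so the payload equals len(data).
    print(len(data), sys.getsizeof(data), payload)
```

Whatever the header turns out to be on a given build, the last two columns should differ from each other by exactly that constant.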

For Unicode strings (the unicode type in Python 2, and str in Python 3 before 3.3) the per-character size is 2 or 4 bytes, depending on compilation options. On Python 3.3 and newer, Unicode strings take up between 1 and 4 bytes per character, depending on the contents of the string.
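
The content-dependent storage on Python 3.3 and newer can be estimated by comparing two strings built from the same character, which cancels out the fixed header. This is a rough sketch assuming CPython; bytes_per_char is a hypothetical helper written for this example, not a standard library function:

```python
import sys

def bytes_per_char(ch, n=100):
    """Estimate the storage cost of one extra copy of ch in a str."""
    # Both strings use the same internal representation, so the fixed
    # header cancels out and only the per-character cost remains.
    return (sys.getsizeof(ch * 2 * n) - sys.getsizeof(ch * n)) // n

print(bytes_per_char("a"))           # ASCII letter
print(bytes_per_char("\u20ac"))      # euro sign, outside Latin-1
print(bytes_per_char("\U0001F600"))  # emoji, outside the Basic Multilingual Plane
```

On such an interpreter the three calls should report 1, 2 and 4 bytes respectively, matching the 1 to 4 byte range mentioned above.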
