sys.getsizeof() results don't quite correlate to structure size


Question

I am trying to create a list of size 1 MB. The following code works:

import sys

dummy = ['a' for i in xrange(0, 1024)]
sys.getsizeof(dummy)
Out[1]: 9032

The following code does not:

import os
import sys

dummy = []
dummy.append(os.urandom(1024))
sys.getsizeof(dummy)
Out[1]: 104

Can someone explain why?

If you're wondering why I am not using the first code snippet: I am writing a program to benchmark my memory, using a for loop that writes blocks (of size 1 B, 1 KB and 1 MB) into memory.

import os
import time

dummy = []
start = time.time()
for i in xrange(1, (1024 * 10)):
    dummy.append(os.urandom(1024))  # each iteration appends one 1 KB block
end = time.time()

Recommended answer

If you check the size of a list, it will provide the size of the list data structure, including the pointers to its constituent elements. It won't consider the size of the elements themselves. That is why the one-element list in the question reports 104 bytes even though the string it holds is 1024 bytes long.

str1_size = sys.getsizeof(['a' for i in xrange(0, 1024)])
str2_size = sys.getsizeof(['abc' for i in xrange(0, 1024)])
int_size = sys.getsizeof([123 for i in xrange(0, 1024)])
none_size = sys.getsizeof([None for i in xrange(0, 1024)])
str1_size == str2_size == int_size == none_size    # True

The size of an empty list: sys.getsizeof([]) == 72
Add an element: sys.getsizeof([1]) == 80
Add another element: sys.getsizeof([1, 1]) == 88
So each element adds 8 bytes (one pointer).
To get 1024 bytes, we need (1024 - 72) / 8 = 119 elements.

The size of the list with 119 elements: sys.getsizeof([None for i in xrange(0, 119)]) == 1080.
It overshoots 1024 because a list keeps an extra buffer for inserting more items, so that it doesn't have to resize on every append. (The size stays at 1080 for any number of elements between 107 and 126.)
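To see the extra buffer at work, here is a minimal sketch that appends to a list one item at a time and records only the lengths at which the reported size changes. The helper name size_steps is made up for this illustration, and the code uses Python 3 spelling (range, print()). The exact numbers depend on the interpreter version and platform, so none are shown.

```python
import sys

def size_steps(n_appends):
    """Return (length, size) pairs for each point where the list's size changes."""
    items = []
    last_size = sys.getsizeof(items)
    steps = [(0, last_size)]
    for _ in range(n_appends):
        items.append(None)
        size = sys.getsizeof(items)
        if size != last_size:          # the list just grew its buffer
            steps.append((len(items), size))
            last_size = size
    return steps

for length, size in size_steps(130):
    print(length, size)
```

The output has far fewer rows than appends, because most appends just fill slots that were already allocated.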

So what we need is an immutable data structure, which doesn't need to keep this buffer: a tuple.

empty_tuple_size = sys.getsizeof(())                     # 56
single_element_size = sys.getsizeof((1,))                # 64
pointer_size = single_element_size - empty_tuple_size    # 8
n_1kb = (1024 - empty_tuple_size) // pointer_size        # (1024 - 56) // 8 = 121
tuple_1kb = (1,) * n_1kb
sys.getsizeof(tuple_1kb) == 1024                         # True

So this is your answer to get a 1 KB (1024-byte) data structure: (1,)*121

But note that this is only the size of the tuple and its constituent pointers. For the total size, you need to add up the sizes of the individual elements as well.
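Adding up the element sizes could look roughly like this. It is a sketch, not a general-purpose deep-size function: total_size is a hypothetical helper that goes only one level deep (nested containers are not followed) and counts an object referenced several times only once. It is written in Python 3 spelling.

```python
import os
import sys

def total_size(container):
    """Shallow size of a list or tuple plus the sizes of its distinct elements."""
    seen = set()
    total = sys.getsizeof(container)
    for item in container:
        if id(item) not in seen:       # count a shared object only once
            seen.add(id(item))
            total += sys.getsizeof(item)
    return total

blocks = [os.urandom(1024) for _ in range(10)]   # ten separate 1 KB blocks
print(sys.getsizeof(blocks))   # list structure and pointers only
print(total_size(blocks))      # also counts the ten blocks themselves
```

For the list built in the question, the second figure is the one that tracks how much data was actually written.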

Alternative:

sys.getsizeof('') == 37
sys.getsizeof('1') == 38     # each character adds 1 byte

For 1 KB (1024 bytes), we need 1024 - 37 = 987 characters:

sys.getsizeof('1'*987) == 1024

And this is the actual size of the data, not just the size of pointers.
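Rather than hardcoding 37 and 987, the overhead can be measured at runtime, since it differs between Python versions. The sketch below assumes CPython, where a bytes object (a plain str on Python 2) grows by exactly one byte per character. The name make_block is invented for this example.

```python
import sys

def make_block(target_bytes):
    """Build a bytes object whose sys.getsizeof() equals target_bytes (CPython)."""
    overhead = sys.getsizeof(b"")      # fixed per-object header, measured at runtime
    if target_bytes < overhead:
        raise ValueError("target is smaller than the object overhead")
    return b"1" * (target_bytes - overhead)

one_kb = make_block(1024)
one_mb = make_block(1024 * 1024)
print(sys.getsizeof(one_kb), sys.getsizeof(one_mb))
```

Note that a 1 B block cannot be built this way, because the object header alone is already larger than one byte.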
