Why custom objects take so much memory?


Question




Hi,

After a couple of days of script debugging, I found that some of the
assumptions I had been making about the memory complexity of my classes
are not true. I wrote a simple script to isolate the problem:

import gc

class MyClass:
    def __init__(self, s):
        self.mystring = s

mylist = []
for i in range(1024*1024):
    mylist.append(MyClass(str(i)))  # allocation
# stage 1
mylist = None
gc.collect()
# stage 2

I measured the memory consumption of the script at stage 1 and
stage 2 and obtained:

#stage 1 - 238 MB
#stage 2 - 15 MB

That means every object is around 223 bytes in size! That's too
much considering it only contains a string with a maximum size of 7
chars.

If you replace the allocation line with this one:

mylist.append(str(i))  # we don't create the custom class, but append the string directly to the list



the numbers decrease substantially to:

#stage 1 - 47.6 MB
#stage 2 - 15 MB

(so this time we can say each string occupies around 32 bytes... still a
lot, isn't it?)

So, what exactly is going on behind the scenes? Why is using custom
objects so expensive? What other, more memory-efficient ways of creating
structures can be used?

Thanks a lot in advance!

Recommended answer

On Dec 18, 2007 1:26 PM, jsanshef <js*******@gmail.com> wrote:




> Hi,
>
> After a couple of days of script debugging, I found that some of the
> assumptions I had been making about the memory complexity of my classes
> are not true. I wrote a simple script to isolate the problem:
>
> class MyClass:
>     def __init__(self, s):
>         self.mystring = s
>
> mylist = []
> for i in range(1024*1024):
>     mylist.append(MyClass(str(i)))  # allocation
> # stage 1
> mylist = None
> gc.collect()
> # stage 2
>
> I measured the memory consumption of the script at stage 1 and
> stage 2 and obtained:
>
> #stage 1 - 238 MB
> #stage 2 - 15 MB
>
> That means every object is around 223 bytes in size! That's too
> much considering it only contains a string with a maximum size of 7
> chars.



Classes are fairly heavyweight - in your case you've got the size of
the PyObject struct, a dictionary for instance attributes (which itself
is another PyObject), and the string object (yet another PyObject),
and the actual string data.
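
A rough way to see these separate pieces on a current interpreter is `sys.getsizeof`, which reports the size of a single object without following its references. This is only a sketch under assumptions: it targets CPython 3, the byte counts depend on version and platform, and recent versions may create the instance dictionary lazily, so the output will not match the 2007 figures discussed in this thread.

```python
import sys

class MyClass:
    def __init__(self, s):
        self.mystring = s

obj = MyClass(str(123456))

# getsizeof is shallow: the instance, its attribute dictionary and the
# string are three separate objects, so each is measured on its own.
parts = {
    "instance": sys.getsizeof(obj),
    "instance __dict__": sys.getsizeof(obj.__dict__),
    "string": sys.getsizeof(obj.mystring),
}

for name, size in parts.items():
    print(f"{name}: {size} bytes")
print("total:", sum(parts.values()), "bytes")
```

The total is what one list element costs in addition to the pointer stored in the list itself.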


> If you replace the allocation line with this one:
>
> mylist.append(str(i))  # we don't create the custom class, but append the string directly to the list
>
> the numbers decrease substantially to:
>
> #stage 1 - 47.6 MB
> #stage 2 - 15 MB
>
> (so this time we can say each string occupies around 32 bytes... still a
> lot, isn't it?)



String objects don't have dictionaries and are smaller than "regular"
Python objects.


> So, what exactly is going on behind the scenes? Why is using custom
> objects so expensive? What other, more memory-efficient ways of creating
> structures can be used?



If you're worried about per-instance memory costs, Python is probably
not the language for your purposes. On the other hand, odds are that
you actually don't need to worry so much.

You can reduce the size of new-style classes (those that inherit from
object) by quite a bit if you use __slots__ to eliminate the
per-instance dictionary.


jsanshef <js*******@gmail.com> writes:

> That means every object is around 223 bytes in size! That's too
> much considering it only contains a string with a maximum size of 7
> chars.



The list itself consumes 4 MB because it stores 1 million PyObject
pointers. It possibly consumes more due to overallocation, but let's
ignore that.

Each object takes 36 bytes itself: 4 bytes refcount + 4 bytes type ptr
+ 4 bytes dict ptr + 4 bytes weakptr + 12 bytes gc overhead. That's
not counting malloc overhead, which should be low since objects aren't
malloced individually. Each object requires a dict, which consumes an
additional 52 bytes of memory (40 bytes for the dict struct plus 12
for gc). That's 88 bytes per object, not counting malloc overhead.

Then there's string allocation: your average string is 6 chars long;
add to that one additional char for the terminating zero. The string
struct takes up 20 bytes + string length, rounded up to the nearest
alignment boundary. For your average case, that's 27 bytes, rounded
(I assume) to 28. You also allocate 1024*1024 integers which are never
freed (they're kept on a free list), and each of which takes up at
least 12 bytes.

All that adds up to 128 bytes per object, dispersed over several
different object types. It doesn't surprise me that Python is eating
200+ MB of memory.
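
The estimate above can be checked by adding up the quoted figures. The sketch below takes the stated totals as given (note that the itemized instance fields listed in the post add up to less than the stated 36 bytes), and the constant names are just labels chosen for this illustration. All values describe the 32-bit CPython 2.x build under discussion, not a current interpreter.

```python
# Figures as stated in the post (32-bit CPython 2.x, 2007).
INSTANCE_BYTES = 36   # instance struct plus gc overhead, as stated
DICT_BYTES = 52       # 40-byte dict struct + 12 bytes gc
STRING_BYTES = 28     # 20-byte header + 6 chars + terminating zero = 27, rounded up
INT_BYTES = 12        # integer object kept on the free list
POINTER_BYTES = 4     # one slot in the list per object

N = 1024 * 1024       # number of objects created by the script

per_object = INSTANCE_BYTES + DICT_BYTES + STRING_BYTES + INT_BYTES
total_mib = (per_object + POINTER_BYTES) * N / (1024 * 1024)

print(per_object)     # 128
print(total_mib)      # 132.0
```

That accounts for roughly 132 MB before any malloc overhead or list overallocation, both of which the post deliberately leaves out.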


> So, what exactly is going on behind the scenes? Why is using custom
> objects so expensive? What other, more memory-efficient ways of creating
> structures can be used?
>
> Thanks a lot in advance!



Use a new-style class and set __slots__:

class MyClass(object):
    __slots__ = ('mystring',)
    def __init__(self, s):
        self.mystring = s

That brings memory consumption down to ~80 MB, by cutting down the size
of the object instance and removing the dict.
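
To compare the two variants on your own machine, one option is the standard `tracemalloc` module. This is a minimal sketch, assuming CPython 3; the class names `Plain` and `Slotted` and the helper `bytes_used` are made up for this illustration, and the absolute numbers will differ from the ~80 MB quoted above.

```python
import gc
import tracemalloc

class Plain:
    def __init__(self, s):
        self.mystring = s

class Slotted:
    __slots__ = ("mystring",)
    def __init__(self, s):
        self.mystring = s

def bytes_used(cls, n=100_000):
    """Return the bytes still allocated after building n instances of cls."""
    gc.collect()
    tracemalloc.start()
    objs = [cls(str(i)) for i in range(n)]
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del objs
    return current

plain = bytes_used(Plain)
slotted = bytes_used(Slotted)
print(f"plain:   {plain / 1e6:.1f} MB")
print(f"slotted: {slotted / 1e6:.1f} MB")
```

Both measurements include the list and the strings, so the difference between them reflects only the per-instance saving from `__slots__`.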


In article <ma***************************************@python.org>,
Chris Mellon <ar*****@gmail.com> wrote:

> You can reduce the size of new-style classes (those that inherit from
> object) by quite a bit if you use __slots__ to eliminate the
> per-instance dictionary.



You can also reduce your functionality quite a bit by using __slots__.
Someday I'll have time to write up a proper page about why you shouldn't
use __slots__....
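
The post does not say which functionality it has in mind. One limitation that is easy to demonstrate is that a slotted instance has no `__dict__`, so attributes not listed in `__slots__` cannot be added later. The sketch below assumes CPython 3 and uses a made-up class name, `Slotted`.

```python
class Slotted:
    __slots__ = ("mystring",)
    def __init__(self, s):
        self.mystring = s

obj = Slotted("abc")
has_dict = hasattr(obj, "__dict__")   # no per-instance dictionary exists

try:
    obj.other = 1                     # "other" is not listed in __slots__
    rejected = False
except AttributeError:
    rejected = True

print(has_dict, rejected)             # False True
```

Code that relies on adding attributes to instances at runtime therefore stops working once a class is switched to `__slots__`.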
--
Aahz (aa**@pythoncraft.com)  <*>  http://www.pythoncraft.com/

"Typing is cheap. Thinking is expensive." --Roy Smith

