Different element sizes in a NumPy array and a list


Problem description

I am using 32-bit Python 3.4 on Windows 7.

I found that an integer in a NumPy array takes 4 bytes, but in a list it takes 10 bytes.

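import sys
import numpy as np

s = 10
lt = [None] * s
cnt = 0
for i in range(0, s):
    lt[cnt] = i  # fill the preallocated list in place
    cnt += 1
lt = [x for x in lt if x is not None]  # drop any slots that were never filled
a = np.array(lt)
# The last term is an assumption: the output below reports the first element's
# size, but the print statement as originally posted did not include it.
print("len(a) is " + str(len(a)) + " size is " + str(sys.getsizeof(a))
      + " bytes " + " a.itemsize is " + str(a.itemsize) + " total size is "
      + str(a.itemsize * len(a)) + " Bytes , len(lt) is "
      + str(len(lt)) + " size is " + str(sys.getsizeof(lt)) + " Bytes "
      + "the first element has " + str(sys.getsizeof(lt[0])) + " Bytes")

   len(a) is 10 size is 40 bytes  a.itemsize is 4 total size is 40 Bytes , len(lt) is 10 size is 100 Bytes the first element has 12 Bytes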

Is this because each element in a list has to keep a pointer to the next element?

If I assign a string to the list elements instead:

  lt[cnt] = "A"

  len(a) is 10 size is 40 bytes  a.itemsize is 4 total size is 40 Bytes , len(lt) is 10 size is 100 Bytes the first element has 30 Bytes

So in the array each element takes 4 bytes, while the list element takes 30 bytes.

But if I try:

    lt[cnt] = "AB"

    len(a) is 10 size is 40 bytes  a.itemsize is 8 total size is 80 Bytes , len(lt) is 10 size is 100 Bytes the first element has 33 Bytes

In the array each element takes 8 bytes, but the list element takes 33 bytes.

And if I try:

  lt[cnt] = "csedvserb revrvrrw gvrgrwgervwe grujy oliulfv qdqdqafwg5u u56i78k8 awdwfw"  # 73 characters long

 len(a) is 10 size is 40 bytes  a.itemsize is 292 total size is 2920 Bytes , len(lt) is 10 size is 100 Bytes the first element has 246 Bytes

In the array each element takes 292 bytes (= 73 * 4), but the list element takes only 246 bytes?

Any explanation will be appreciated.

Recommended answer

The element size in arrays is easy: it is determined by the dtype and, as your code shows, can be found with .itemsize. 4 bytes is common, for example for np.int32 and np.float32 (np.float64 takes 8). Unicode strings in an array are also allocated 4 bytes per character, whereas real Unicode encodings, and Python's own str, use a variable number of bytes per character.
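To make the dtype rule concrete, here is a minimal sketch that reads .itemsize for a few dtypes. The helper name `itemsizes` is made up for illustration; the two string arrays mirror the "AB" and 73-character cases from the question.

```python
import numpy as np

def itemsizes():
    """Return the per-element size in bytes for a few dtypes."""
    return {
        "int32": np.dtype(np.int32).itemsize,
        "float32": np.dtype(np.float32).itemsize,
        "float64": np.dtype(np.float64).itemsize,
        # Fixed-width Unicode: every slot is (max length) * 4 bytes.
        "U2": np.array(["AB"]).itemsize,
        "U73": np.array(["x" * 73]).itemsize,
    }

sizes = itemsizes()
for name, nbytes in sizes.items():
    print(name, nbytes)
```

Because the width is fixed per array, one long string makes every element of a string array that wide, which is why the 73-character case costs 292 bytes per slot.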

The per-element size for lists (and tuples) is trickier. A list does not contain the elements directly; it contains pointers to objects that are stored elsewhere. The size reported for your list covers the list header and its pointers, plus some padding. The padding lets the list grow (with .append) efficiently. All your lists have the same size, regardless of the size of the 'first item'. Note that list elements do not point to the next element: a Python list is an array of pointers, not a linked list.
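A small sketch of the difference between the list's own size and the size of what it points to. The helper `shallow_and_deep_size` is a hypothetical name, and the exact byte counts depend on the Python build, so the example only compares them.

```python
import sys

def shallow_and_deep_size(items):
    """Return (size of the list object alone, size including its elements)."""
    shallow = sys.getsizeof(items)
    # Count each distinct object once; a list may hold many pointers
    # to the same object.
    distinct = {id(obj): obj for obj in items}
    deep = shallow + sum(sys.getsizeof(obj) for obj in distinct.values())
    return shallow, deep

short_sizes = shallow_and_deep_size(["A"] * 10)
long_sizes = shallow_and_deep_size(["A" * 73] * 10)
print(short_sizes, long_sizes)
```

The first number is the same for both lists because each holds ten pointers; only the second number, which follows the pointers, reflects the string length.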

My data:

In [2324]: lt=[None]*10
In [2325]: sys.getsizeof(lt)
Out[2325]: 72
In [2326]: lt=[i for i in range(10)]
In [2327]: sys.getsizeof(lt)
Out[2327]: 96
In [2328]: lt=['A' for i in range(10)]
In [2329]: sys.getsizeof(lt)
Out[2329]: 96
In [2330]: lt=['AB' for i in range(10)]
In [2331]: sys.getsizeof(lt)
Out[2331]: 96
In [2332]: lt=['ABCDEF' for i in range(10)]
In [2333]: sys.getsizeof(lt)
Out[2333]: 96
In [2334]: lt=[None for i in range(10)]
In [2335]: sys.getsizeof(lt)
Out[2335]: 96
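One plausible reading of the 72 versus 96 difference above is over-allocation: [None]*10 can be sized exactly, while a list comprehension grows the list by appending. The sketch below, assuming CPython and using a made-up helper name, records the reported size after each append; the exact numbers vary by version and platform.

```python
import sys

def sizes_while_appending(n):
    """Record sys.getsizeof of a list after each of n appends."""
    items = []
    sizes = []
    for i in range(n):
        items.append(i)
        sizes.append(sys.getsizeof(items))
    return sizes

sizes = sizes_while_appending(64)
print(sizes)
```

The reported size stays flat for several appends and then jumps, which is the padding being used up and replenished.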

And the corresponding arrays:

In [2344]: lt=[None]*10; a=np.array(lt)
In [2345]: a
Out[2345]: array([None, None, None, None, None, None, None, None, None, None], dtype=object)
In [2346]: a.itemsize
Out[2346]: 4
In [2347]: lt=['AB' for i in range(10)]; a=np.array(lt)
In [2348]: a
Out[2348]: 
array(['AB', 'AB', 'AB', 'AB', 'AB', 'AB', 'AB', 'AB', 'AB', 'AB'], 
      dtype='<U2')
In [2349]: a.itemsize
Out[2349]: 8

When the list contains None, the array has object dtype, and its elements are all pointers (4 bytes each on this 32-bit build).
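As a quick check of the pointer claim, this sketch compares the itemsize of an object array with the platform's pointer size. Using struct.calcsize("P") as the pointer size is my choice for illustration; it is not part of the original answer.

```python
import struct
import numpy as np

pointer_size = struct.calcsize("P")  # size of a C pointer on this build
obj_array = np.array([None] * 10)    # None forces dtype=object
print(obj_array.dtype, obj_array.itemsize, pointer_size)
```

On a 64-bit build both numbers are expected to be 8 rather than the 4 shown in the session above.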
