Why is there a large overhead in pickling numpy arrays?

Posted: 2021/6/17 18:44:54. Tags: python, numpy, pickle

Suppose I have a simple array in Python:

>>> x = [1.0, 2.0, 3.0, 4.0]

When pickled, it is a reasonably small size:

>>> pickle.dumps(x).__len__()
44

How come if I use a numpy array, the size is so much larger?

>>> xn = np.array(x)
>>> pickle.dumps(xn).__len__()
187

Converting it to a less precise data type only helps a little bit...

>>> x16 = xn.astype('float16')
>>> pickle.dumps(x16).__len__()
163

Other numpy/scipy data structures like sparse matrices also don't pickle well. Why?

Solution

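Checking it in a debugger, a numpy array carries more than its data: it also has attributes such as its dtype and shape, which a plain Python list does not have. (max and min, which also show up in the debugger, are methods rather than stored fields.)

Pickling is not a plain binary copy of the object. The pickle of an array records a reference to the function used to rebuild it, the array class, the shape and the dtype, followed by the raw data buffer. That metadata is a fixed cost on top of the data, so for a four-element array it dominates the total size. This is consistent with the numbers in the question: going from float64 to float16 saves 6 bytes per element, 24 bytes for four elements, which is exactly the difference between 187 and 163.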
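One way to check this is to measure how many bytes of the pickle are not raw array data, for arrays of different sizes. The snippet below is a minimal sketch, not a definitive benchmark: `pickle_overhead` is a hypothetical helper introduced here for illustration, and the exact byte counts depend on the Python version, the numpy version and the pickle protocol, so no specific numbers are claimed.

```python
import pickle

import numpy as np


def pickle_overhead(arr):
    """Hypothetical helper: bytes in the pickle beyond the raw data buffer."""
    return len(pickle.dumps(arr)) - arr.nbytes


# The overhead should stay in the same small range however large the array is.
overheads = {n: pickle_overhead(np.zeros(n)) for n in (4, 1_000, 100_000)}
print(overheads)

# For a large input, compare the pickled array with the pickled list
# holding the same values.
big = np.arange(100_000, dtype=np.float64)
array_size = len(pickle.dumps(big))
list_size = len(pickle.dumps(big.tolist()))
print(array_size, list_size)

# ndarray.__reduce__ shows what pickle is asked to store: a rebuild
# function, its arguments, and a state tuple that includes shape and dtype.
reconstruct, args, state = np.array([1.0, 2.0, 3.0, 4.0]).__reduce__()
print(state[1], state[2])  # shape and dtype of the array
```

If the overhead is indeed fixed, it only matters for very small arrays; for large ones the pickled array should come out smaller than the pickled list, since the list stores a type marker alongside every float.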
