与常规 Python 列表相比,NumPy 有哪些优势? [英] What are the advantages of NumPy over regular Python lists?

查看:42
本文介绍了与常规 Python 列表相比,NumPy 有哪些优势?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

NumPy 与常规 Python 列表相比有哪些优势?

What are the advantages of NumPy over regular Python lists?

我有大约 100 个金融市场系列,我将创建一个 100x100x100 = 100 万个单元格的立方体阵列.我将回归(3 变量)每个 x 和每个 y 和 z,以用标准误差填充数组.

I have approximately 100 financial markets series, and I am going to create a cube array of 100x100x100 = 1 million cells. I will be regressing (3-variable) each x with each y and z, to fill the array with standard errors.

我听说对于大矩阵",出于性能和可扩展性的原因,我应该使用 NumPy 而不是 Python 列表.事情是,我知道 Python 列表,它们似乎对我有用.

I have heard that for "large matrices" I should use NumPy as opposed to Python lists, for performance and scalability reasons. Thing is, I know Python lists and they seem to work for me.

如果我改用 NumPy 会有什么好处?

What will the benefits be if I move to NumPy?

如果我有 1000 个系列(即多维数据集中的 10 亿个浮点单元格)会怎样?

What if I had 1000 series (that is, 1 billion floating point cells in the cube)?

推荐答案

NumPy 的数组比 Python 列表更紧凑——您描述的列表列表在 Python 中至少需要 20 MB 左右,而 NumPy单元格中具有单精度浮点数的 3D 数组将适合 4 MB.使用 NumPy 也可以更快地读取和写入项目.

NumPy's arrays are more compact than Python lists -- a list of lists as you describe, in Python, would take at least 20 MB or so, while a NumPy 3D array with single-precision floats in the cells would fit in 4 MB. Access in reading and writing items is also faster with NumPy.

也许你不太关心一百万个单元,但你肯定会关心十亿个单元——这两种方法都不适合 32 位架构,但使用 64 位构建 NumPy 会逃脱 4GB 左右,仅 Python 就需要至少大约 12 GB(大量指针加倍)——这是一个昂贵得多的硬件!

Maybe you don't care that much for just a million cells, but you definitely would for a billion cells -- neither approach would fit in a 32-bit architecture, but with 64-bit builds NumPy would get away with 4 GB or so, Python alone would need at least about 12 GB (lots of pointers which double in size) -- a much costlier piece of hardware!

差异主要是由于间接性"——Python 列表是一个指向 Python 对象的指针数组,每个指针至少有 4 个字节加上即使是最小的 Python 对象也有 16 个字节(4 个用于类型指针,4 个用于引用计数,4 表示值——内存分配器四舍五入到 16).NumPy 数组是一个统一值数组——单精度数每个占 4 个字节,双精度数占 8 个字节.不太灵活,但您为标准 Python 列表的灵活性付出了大量代价!

The difference is mostly due to "indirectness" -- a Python list is an array of pointers to Python objects, at least 4 bytes per pointer plus 16 bytes for even the smallest Python object (4 for type pointer, 4 for reference count, 4 for value -- and the memory allocators rounds up to 16). A NumPy array is an array of uniform values -- single-precision numbers takes 4 bytes each, double-precision ones, 8 bytes. Less flexible, but you pay substantially for the flexibility of standard Python lists!

这篇关于与常规 Python 列表相比,NumPy 有哪些优势?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆