与常规Python列表相比,NumPy有什么优势? [英] What are the advantages of NumPy over regular Python lists?

查看:314
本文介绍了与常规Python列表相比,NumPy有什么优势?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

与常规Python列表相比, NumPy 有什么优势?

What are the advantages of NumPy over regular Python lists?

我有大约100个金融市场系列,我将创建一个100x100x100 = 1百万个单元的多维数据集阵列.我将每个x与y和z回归(3变量),以用标准误差填充数组.

I have approximately 100 financial markets series, and I am going to create a cube array of 100x100x100 = 1 million cells. I will be regressing (3-variable) each x with each y and z, to fill the array with standard errors.

我听说对于大型矩阵",出于性能和可伸缩性的原因,我应该使用NumPy而不是Python列表.事实是,我知道Python列表,它们似乎对我有用.

I have heard that for "large matrices" I should use NumPy as opposed to Python lists, for performance and scalability reasons. Thing is, I know Python lists and they seem to work for me.

如果我转到NumPy,会有什么好处?

What will the benefits be if I move to NumPy?

如果我有1000个序列(即立方体中有10亿个浮点单元)怎么办?

What if I had 1000 series (that is, 1 billion floating point cells in the cube)?

推荐答案

NumPy的数组比Python列表更紧凑-您所描述的列表列表在Python中至少需要20 MB左右,而NumPy单元格中具有单精度浮点数的3D数组可容纳4 MB.使用NumPy可以更快地读取和写入项目.

NumPy's arrays are more compact than Python lists -- a list of lists as you describe, in Python, would take at least 20 MB or so, while a NumPy 3D array with single-precision floats in the cells would fit in 4 MB. Access in reading and writing items is also faster with NumPy.

也许您只关心一百万个单元格就不会那么在意,但是您肯定会关心十亿个单元格-两种方法都不适合32位体系结构,但是使用64位版本,NumPy可以摆脱4大约需要1 GB的内存,仅Python就需要至少12 GB(很多指针的大小加倍),这是一个昂贵得多的硬件!

Maybe you don't care that much for just a million cells, but you definitely would for a billion cells -- neither approach would fit in a 32-bit architecture, but with 64-bit builds NumPy would get away with 4 GB or so, Python alone would need at least about 12 GB (lots of pointers which double in size) -- a much costlier piece of hardware!

差异主要是由于间接性"造成的-Python列表是指向Python对象的指针的数组,每个指针至少4个字节,对于最小的Python对象至少也要16个字节(类型指针为4个,引用为4个)计数(4为值)-内存分配器舍入为16). NumPy数组是统一值的数组-单精度数字每个占用4个字节,双精度数字每个占用8个字节.灵活性较差,但是您需要为标准Python列表的灵活性付出巨大的代价!

The difference is mostly due to "indirectness" -- a Python list is an array of pointers to Python objects, at least 4 bytes per pointer plus 16 bytes for even the smallest Python object (4 for type pointer, 4 for reference count, 4 for value -- and the memory allocators rounds up to 16). A NumPy array is an array of uniform values -- single-precision numbers takes 4 bytes each, double-precision ones, 8 bytes. Less flexible, but you pay substantially for the flexibility of standard Python lists!

这篇关于与常规Python列表相比,NumPy有什么优势?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆