NumPy arrays vs Python arrays
Problem description
I noticed that the de facto standard for array manipulation in Python is the excellent numpy library. However, the Python Standard Library also has an array module, which seems to me to have a similar use case to numpy.
Is there any real-world example where array is preferable to numpy or a plain list?
From my naive interpretation, array is just a memory-efficient container for homogeneous data, but it offers no means of improving computational efficiency.
Edit
Just out of curiosity, I searched GitHub for Python code: import array gives 186'721 hits, while import numpy gives 8'062'678 hits.
However, I could not find a popular repository using array.
Recommended answer
To understand the differences between numpy and array, I ran a few quantitative tests.
What I found is that, on my system (Ubuntu 18.04, Python 3), array is roughly twice as fast as numpy at building a large array from a range object, but roughly half as fast as list. (numpy's dedicated np.arange() is much faster than any of these. It is so fast, in fact, that I wondered whether something was being cached during the tests.)
However, quite surprisingly, array objects seem to be larger than their numpy counterparts. The list objects, in turn, are roughly 8-13% larger than the array objects (this will obviously vary with the size of the individual items). Compared to list, array offers a way to control the size of the number objects.
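As a minimal sketch of what "controlling the size of the number objects" means, the snippet below stores the same values under three typecodes and compares the bytes used per element. The helper payload_bytes is a hypothetical name introduced here for illustration; the exact sys.getsizeof figures depend on the platform and Python build, so none are quoted.

```python
import array
import sys

values = range(100)

# Same values, different typecodes; itemsize is the number of bytes per element.
as_int8 = array.array('b', values)    # signed char
as_int16 = array.array('h', values)   # signed short
as_double = array.array('d', values)  # C double (ints are converted to float)


def payload_bytes(arr):
    """Bytes taken by the elements alone (hypothetical helper, excludes object overhead)."""
    return len(arr) * arr.itemsize


for arr in (as_int8, as_int16, as_double):
    # sys.getsizeof also includes the fixed overhead of the array object itself.
    print(arr.typecode, arr.itemsize, payload_bytes(arr), sys.getsizeof(arr))
```

A list has no equivalent knob: every element is a reference to a full Python object, whatever its value.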
So perhaps the only sensible use case for array is when numpy is not available.
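To illustrate that case, here is a small sketch that uses only the standard library: it packs floats into an array, round-trips them through raw bytes, and reduces them with a builtin. The sample values are made up for the example.

```python
import array

# Compact storage of C doubles without any third-party dependency.
samples = array.array('d', [0.5, 1.5, 2.5])

# Serialize to raw bytes (for example, to write to a binary file or socket)...
raw = samples.tobytes()

# ...and rebuild an equal array from those bytes.
restored = array.array('d')
restored.frombytes(raw)

# Element-wise math still goes through the Python interpreter, one item at a time.
total = sum(restored)
print(len(raw), list(restored), total)
```

This gives compact storage and cheap binary I/O, but none of numpy's vectorized arithmetic.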
For completeness, here is the code that I used for the tests:
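# Note: the %timeit lines are IPython magics; run this in IPython or a Jupyter notebook.
import array
import sys

import matplotlib.pyplot as plt
import numpy as np

num = int(1e6)  # number of elements used in the timing tests
num_i = 100     # number of container sizes sampled in the memory tests
# Logarithmically spaced container sizes from 10 up to num
x = np.logspace(1, int(np.log10(num)), num_i).astype(int)

# Time to build each container from range(num).
# 'l' is a C long: 8 bytes on 64-bit Linux, 4 bytes on Windows.
%timeit list(range(num))
# 10 loops, best of 3: 32.8 ms per loop
%timeit array.array('l', range(num))
# 10 loops, best of 3: 86.3 ms per loop
%timeit np.array(range(num), dtype=np.int64)
# 10 loops, best of 3: 180 ms per loop
%timeit np.arange(num, dtype=np.int64)
# 1000 loops, best of 3: 809 µs per loop

# Size reported by sys.getsizeof for each container type at each size.
# Note: for a list this counts only the list's own pointer storage,
# not the int objects it refers to.
y_list = np.array([sys.getsizeof(list(range(x_i))) for x_i in x])
y_array = np.array([sys.getsizeof(array.array('l', range(x_i))) for x_i in x])
y_np = np.array([sys.getsizeof(np.array(range(x_i), dtype=np.int64)) for x_i in x])

# Plot reported size against number of elements
plt.figure(figsize=(12, 6))
plt.plot(x, y_list, label='list')
plt.plot(x, y_array, label='array')
plt.plot(x, y_np, label='numpy')
plt.legend()
plt.show()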
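The %timeit lines above only run under IPython. As a rough sketch of the same construction-time comparison in plain Python, the snippet below uses the standard timeit module. It departs from the original in a few ways that are assumptions of this sketch: num is reduced so it finishes quickly, the typecode 'q' (always 8 bytes) replaces 'l', and numpy is treated as optional. Timings will differ from the figures quoted above.

```python
import array
import timeit

try:
    import numpy as np
except ImportError:  # numpy is optional in this sketch
    np = None

num = 100_000  # smaller than the 1e6 used above, to keep the run short
calls = 5      # calls per timing run

candidates = {
    "list": lambda: list(range(num)),
    "array": lambda: array.array('q', range(num)),
}
if np is not None:
    candidates["np.array"] = lambda: np.array(range(num), dtype=np.int64)
    candidates["np.arange"] = lambda: np.arange(num, dtype=np.int64)

# Best of 3 runs, reported as seconds per single call.
timings = {
    name: min(timeit.repeat(func, number=calls, repeat=3)) / calls
    for name, func in candidates.items()
}

for name, seconds in timings.items():
    print(f"{name:>10}: {seconds * 1e3:.3f} ms per call")
```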