Why is numpy.array() sometimes very slow?
Question
I'm using the numpy.array() function to create numpy.float64 ndarrays from lists.
I noticed that this is very slow when either the list contains None or a list of lists is provided.
Below are some examples with timings. There are obvious workarounds, but why is this so slow?
Examples for a list of None:
### Very slow to call array() with list of None
In [3]: %timeit numpy.array([None]*100000, dtype=numpy.float64)
1 loops, best of 3: 240 ms per loop
### Problem doesn't exist with array of zeroes
In [4]: %timeit numpy.array([0.0]*100000, dtype=numpy.float64)
100 loops, best of 3: 9.94 ms per loop
### Also fast if we use dtype=object and convert to float64
In [5]: %timeit numpy.array([None]*100000, dtype=numpy.object).astype(numpy.float64)
100 loops, best of 3: 4.92 ms per loop
### Also fast if we use fromiter() instead of array()
In [6]: %timeit numpy.fromiter([None]*100000, dtype=numpy.float64)
100 loops, best of 3: 3.29 ms per loop
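The comparison above can be reproduced outside IPython with the standard `timeit` module. The following is a minimal sketch, not the original poster's benchmark: the helper names `direct`, `via_object`, `via_fromiter` and `compare` are hypothetical and introduced only for illustration. It uses the builtin `object` rather than `numpy.object`, on the assumption that a recent NumPy is installed (the `numpy.object` alias was removed in NumPy 1.24). Timings will differ by machine and NumPy version, so none are quoted here.

```python
import timeit

import numpy as np


def direct(values):
    """Baseline from the question: convert straight to float64."""
    return np.array(values, dtype=np.float64)


def via_object(values):
    """Workaround: build an object array first, then cast to float64."""
    return np.array(values, dtype=object).astype(np.float64)


def via_fromiter(values):
    """Workaround: stream the items with fromiter (1-D output only)."""
    return np.fromiter(values, dtype=np.float64, count=len(values))


def compare(values, repeat=3, number=5):
    """Return the best time per call, in seconds, for each approach."""
    results = {}
    for func in (direct, via_object, via_fromiter):
        times = timeit.repeat(lambda: func(values), repeat=repeat, number=number)
        results[func.__name__] = min(times) / number
    return results


if __name__ == "__main__":
    values = [None] * 100000
    for name, seconds in compare(values).items():
        print(f"{name}: {seconds * 1e3:.2f} ms per call")
```

All three helpers should return the same array (every None becomes NaN), so the workarounds can be swapped in without changing the result.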
Examples for a list of lists:
### Very slow to create column matrix
In [7]: %timeit numpy.array([[0.0]]*100000, dtype=numpy.float64)
1 loops, best of 3: 353 ms per loop
### No problem to create column vector and reshape
In [8]: %timeit numpy.array([0.0]*100000, dtype=numpy.float64).reshape((-1,1))
100 loops, best of 3: 10 ms per loop
### Can use itertools to flatten input lists
In [9]: %timeit numpy.fromiter(itertools.chain.from_iterable([[0.0]]*100000),dtype=numpy.float64).reshape((-1,1))
100 loops, best of 3: 9.65 ms per loop
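The flatten-and-reshape workaround can be wrapped in a small helper and checked against the direct conversion. This is a sketch under stated assumptions: the function names `rows_direct` and `rows_via_chain` are hypothetical, and the helper assumes every inner list has the same length (a rectangular input), which the original examples satisfy.

```python
import itertools

import numpy as np


def rows_direct(rows):
    """Baseline from the question: pass the nested list straight to array()."""
    return np.array(rows, dtype=np.float64)


def rows_via_chain(rows):
    """Workaround: flatten with itertools, build a 1-D array, then reshape.

    Assumes `rows` is non-empty and every inner list has the same length.
    """
    width = len(rows[0])
    flat = itertools.chain.from_iterable(rows)
    return np.fromiter(flat, dtype=np.float64).reshape((-1, width))


if __name__ == "__main__":
    rows = [[0.0]] * 100000
    # Both paths should give the same 100000 x 1 column.
    print(np.array_equal(rows_direct(rows), rows_via_chain(rows)))
```

The reshape only changes the array's shape metadata, so the extra step adds little cost on top of building the flat array.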
Recommended answer
I've reported this as a NumPy issue. The report and patch files are here:
https://github.com/numpy/numpy/issues/3392
After the patch:
# was 240 ms, best alternative was 3.29 ms
In [5]: %timeit numpy.array([None]*100000)
100 loops, best of 3: 7.49 ms per loop
# was 353 ms, best alternative was 9.65 ms
In [6]: %timeit numpy.array([[0.0]]*100000)
10 loops, best of 3: 23.7 ms per loop