为什么numpy.array()有时会很慢? [英] Why is numpy.array() is sometimes very slow?

查看:211
本文介绍了为什么numpy.array()有时会很慢?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用numpy.array()函数从列表中创建numpy.float64 ndarrays.

I'm using the numpy.array() function to create numpy.float64 ndarrays from lists.

我注意到当列表不包含任何内容或提供列表时,这非常慢.

I noticed that this is very slow when either the list contains None or a list of lists is provided.

下面是一些带有时间的示例.有明显的解决方法,但是为什么这么慢?

Below are some examples with times. There are obvious workarounds but why is this so slow?

无"列表的示例:

### Very slow to call array() with list of None
In [3]: %timeit numpy.array([None]*100000, dtype=numpy.float64)
1 loops, best of 3: 240 ms per loop

### Problem doesn't exist with array of zeroes
In [4]: %timeit numpy.array([0.0]*100000, dtype=numpy.float64)
100 loops, best of 3: 9.94 ms per loop

### Also fast if we use dtype=object and convert to float64
In [5]: %timeit numpy.array([None]*100000, dtype=numpy.object).astype(numpy.float64)
100 loops, best of 3: 4.92 ms per loop

### Also fast if we use fromiter() insead of array()
In [6]: %timeit numpy.fromiter([None]*100000, dtype=numpy.float64)
100 loops, best of 3: 3.29 ms per loop

列表列表示例:

### Very slow to create column matrix
In [7]: %timeit numpy.array([[0.0]]*100000, dtype=numpy.float64)
1 loops, best of 3: 353 ms per loop

### No problem to create column vector and reshape
In [8]: %timeit numpy.array([0.0]*100000, dtype=numpy.float64).reshape((-1,1))
100 loops, best of 3: 10 ms per loop

### Can use itertools to flatten input lists
In [9]: %timeit numpy.fromiter(itertools.chain.from_iterable([[0.0]]*100000),dtype=numpy.float64).reshape((-1,1))
100 loops, best of 3: 9.65 ms per loop

推荐答案

我将其报告为一个numpy问题.报告和补丁文件在这里:

I've reported this as a numpy issue. The report and patch files are here:

https://github.com/numpy/numpy/issues/3392

修补后:

# was 240 ms, best alternate version was 3.29
In [5]: %timeit numpy.array([None]*100000)
100 loops, best of 3: 7.49 ms per loop

# was 353 ms, best alternate version was 9.65
In [6]: %timeit numpy.array([[0.0]]*100000)
10 loops, best of 3: 23.7 ms per loop

这篇关于为什么numpy.array()有时会很慢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆