Pythonic way to create a numpy array from a list of numpy arrays


Question

I generate a list of one-dimensional numpy arrays in a loop and later convert this list to a 2d numpy array. I would have preallocated a 2d numpy array if I knew the number of items ahead of time, but I don't, so I put everything in a list.

A mock-up is as follows:

>>> from numpy import array, ones
>>> list_of_arrays = list(map(lambda x: x*ones(2), range(5)))
>>> list_of_arrays
[array([ 0.,  0.]), array([ 1.,  1.]), array([ 2.,  2.]), array([ 3.,  3.]), array([ 4.,  4.])]
>>> arr = array(list_of_arrays)
>>> arr
array([[ 0.,  0.],
       [ 1.,  1.],
       [ 2.,  2.],
       [ 3.,  3.],
       [ 4.,  4.]])

My question is the following:

Is there a better way (performance-wise) to collect sequential numerical data (in my case numpy arrays) than putting them in a list and then making a numpy.array out of it (which creates a new object and copies the data)? Is there an "expandable" matrix data structure available in a well-tested module?

A typical size of my 2d matrix would be between 100x10 and 5000x10 floats.

Edit: In this example I'm using map, but in my actual application I have a for loop.

Recommended answer

Suppose you know that the final array arr will never be larger than 5000x10. Then you could pre-allocate an array of maximum size, populate it with data as you go through the loop, and then use arr.resize to cut it down to the discovered size after exiting the loop.
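The pre-allocate, fill, and trim pattern described above can be sketched as follows. This is only an illustration under stated assumptions: MAX_ROWS and COLS come from the sizes mentioned in the question, while collect_rows and fake_rows are hypothetical names that do not appear in the original answer. The sketch passes refcheck=False to arr.resize, which skips numpy's reference-count check; that is only safe because no other array or view shares arr's buffer here.

```python
import numpy as np

MAX_ROWS, COLS = 5000, 10  # assumed upper bound, based on the sizes in the question


def collect_rows(row_source):
    """Fill a preallocated array from an iterable of 1-D rows, then trim it.

    Hypothetical helper for illustration. Raises IndexError if the source
    yields more than MAX_ROWS rows.
    """
    arr = np.empty((MAX_ROWS, COLS))
    count = 0
    for row in row_source:
        arr[count] = row
        count += 1
    # Shrink in place to the number of rows actually written.
    # refcheck=False is safe here because nothing else references arr's buffer.
    arr.resize((count, COLS), refcheck=False)
    return arr


def fake_rows(n):
    """Hypothetical stand-in for a loop whose length is not known in advance."""
    for x in range(n):
        yield x * np.ones(COLS)


result = collect_rows(fake_rows(7))
print(result.shape)
```

Because the array is C-contiguous and only the number of rows shrinks, the rows written before the resize are preserved unchanged.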

The tests below suggest that doing so is slightly faster than constructing intermediate Python lists, no matter what the ultimate size of the array is.

Also, arr.resize de-allocates the unused memory, so the final (though maybe not the intermediate) memory footprint is smaller than what is used by python_lists_to_array.
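As a rough illustration of the size change, the sketch below compares the array's nbytes before and after an in-place resize. Note the assumption: nbytes reports the size of the array's own data buffer (shape times item size), not the memory the operating system attributes to the process, so it only approximates the footprint claim above. The row count of 100 is an arbitrary example value.

```python
import numpy as np

arr = np.empty((5000, 10))   # float64 by default: 8 bytes per element
before = arr.nbytes          # size of the preallocated buffer

# Trim in place to 100 rows; refcheck=False is safe because no views exist.
arr.resize((100, 10), refcheck=False)
after = arr.nbytes           # size of the buffer after trimming

print(before, after)  # 400000 8000
```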

This shows numpy_all_the_way is faster:

% python -mtimeit -s"import test" "test.numpy_all_the_way(100)"
100 loops, best of 3: 1.78 msec per loop
% python -mtimeit -s"import test" "test.numpy_all_the_way(1000)"
100 loops, best of 3: 18.1 msec per loop
% python -mtimeit -s"import test" "test.numpy_all_the_way(5000)"
10 loops, best of 3: 90.4 msec per loop

% python -mtimeit -s"import test" "test.python_lists_to_array(100)"
1000 loops, best of 3: 1.97 msec per loop
% python -mtimeit -s"import test" "test.python_lists_to_array(1000)"
10 loops, best of 3: 20.3 msec per loop
% python -mtimeit -s"import test" "test.python_lists_to_array(5000)"
10 loops, best of 3: 101 msec per loop
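For readers who prefer to repeat the comparison from inside Python rather than from the shell, here is a minimal sketch using the standard timeit module. It assumes Python 3, redefines both functions so that it is self-contained, and uses a list comprehension in place of map; the repeat and number values are arbitrary choices, and the timings will differ from the figures above depending on hardware and numpy version.

```python
import timeit

import numpy as np

N, M = 5000, 10  # maximum rows and column count, as in test.py


def python_lists_to_array(k):
    # Build a Python list of 1-D arrays, then copy it into one 2-D array.
    return np.array([x * np.ones(M) for x in range(k)])


def numpy_all_the_way(k):
    # Preallocate the maximum size, fill k rows, then trim in place.
    arr = np.empty((N, M))
    for x in range(k):
        arr[x] = x * np.ones(M)
    arr.resize((k, M), refcheck=False)
    return arr


for k in (100, 1000, 5000):
    for func in (numpy_all_the_way, python_lists_to_array):
        # Best of 3 repeats, 5 calls each, reported per call.
        best = min(timeit.repeat(lambda: func(k), number=5, repeat=3)) / 5
        print("%s(%d): %.2f msec per loop" % (func.__name__, k, best * 1e3))
```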

This shows numpy_all_the_way uses less memory:

% test.py
Initial memory usage: 19788
After python_lists_to_array: 20976
After numpy_all_the_way: 20348

test.py:

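#!/usr/bin/env python
import os

import numpy as np


def memory_usage():
    # Linux only: read this process's VmSize (virtual memory size, in kB)
    # from /proc/<pid>/status.
    with open('/proc/%s/status' % os.getpid()) as f:
        return next(line for line in f
                    if line.startswith('VmSize')).split()[-2]


N, M = 5000, 10  # maximum number of rows, number of columns


def python_lists_to_array(k):
    # list() is required on Python 3, where map() returns a lazy iterator
    # that np.array would not expand into rows.
    list_of_arrays = list(map(lambda x: x * np.ones(M), range(k)))
    arr = np.array(list_of_arrays)
    return arr


def numpy_all_the_way(k):
    arr = np.empty((N, M))   # preallocate the maximum size
    for x in range(k):
        arr[x] = x * np.ones(M)
    arr.resize((k, M))       # shrink in place to the rows actually filled
    return arr


if __name__ == '__main__':
    print('Initial memory usage: %s' % memory_usage())
    arr = python_lists_to_array(5000)
    print('After python_lists_to_array: %s' % memory_usage())
    arr = numpy_all_the_way(5000)
    print('After numpy_all_the_way: %s' % memory_usage())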
