Efficient way for appending numpy array


Problem description


I will keep it simple. I have a loop that appends a new row to a numpy array. What is an efficient way to do this?

   import numpy as np

   n = np.zeros([1, 2])
   for x in [[2, 3], [4, 5], [7, 6]]:
       n = np.append(n, [x], axis=0)  # add x as a new row

The problem is that the initial [0, 0] row stays in the array, so I have to remove it with

   n = np.delete(n, 0, axis=0)

This seems dumb, so please tell me an efficient way to do this.

   n = np.empty([1, 2])

is even worse, because it creates a row of uninitialised values.

Solution

A bit of technical explanation for the "why lists" part.

Internally, the problem for a list of unknown length is that it needs to fit in memory somehow regardless of its length. There are essentially two different possibilities:

  1. Use a data structure (linked list, some tree structure, etc.) which makes it possible to allocate memory separately for each new element in a list.

  2. Store the data in a contiguous memory area. This area has to be allocated when the list is created, and it has to be larger than what we initially need. If we get more stuff into the list, we need to try to allocate more memory, preferably at the same location. If we cannot do it at the same location, we need to allocate a bigger block and move all data.

The first approach enables all sorts of fancy insertion and deletion options, sorting, etc. However, it is slower for sequential reading and allocates more memory. Python actually uses method #2: lists are stored as "dynamic arrays". For more information on this, please see:

Size of list in memory
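
The over-allocation described in method #2 can be observed with a rough sketch like the one below. It relies on sys.getsizeof, whose reported values are a CPython implementation detail and are assumed here only for illustration; the variable names are hypothetical.

```python
import sys

items = []
sizes = []
for i in range(64):
    items.append(i)
    # Record the size of the list object (not its elements) after each append.
    sizes.append(sys.getsizeof(items))

# Count how often the reported size changed between consecutive appends.
# With over-allocation, this is far fewer than the number of appends.
growth_events = sum(1 for a, b in zip(sizes, sizes[1:]) if a != b)
print(growth_events)
```

Most appends leave the reported size unchanged, because the list reuses spare capacity that was reserved at an earlier reallocation.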

What this means is that lists are designed to be very efficient with the use of append. There is very little you can do to speed things up if you do not know the size of the list beforehand.
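
As a minimal sketch of that idea applied to the question's loop (the name rows is made up for illustration), the rows can be collected in a plain list and converted to an array once at the end, which also avoids the dummy first row:

```python
import numpy as np

rows = []  # plain Python list; append is cheap
for x in [[2, 3], [4, 5], [7, 6]]:
    rows.append(x)

# Convert to a 2-D array once, after the loop has finished.
n = np.array(rows)
print(n.shape)
```

Each call to np.append copies the whole array, so converting once after the loop avoids repeated copying.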


If you know even the maximum size of the list beforehand, you are probably best off allocating a numpy.array with numpy.empty (not numpy.zeros) at the maximum size and then using ndarray.resize to shrink the array once you have filled in all the data.
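
A hedged sketch of the numpy.empty + ndarray.resize combination follows. The bound max_rows and the names buf and count are assumptions made for illustration, and refcheck=False is used only so the example runs in environments that hold extra references to the array; it is unsafe if other arrays or views share the buffer.

```python
import numpy as np

max_rows = 10  # hypothetical upper bound on the number of rows
buf = np.empty((max_rows, 2), dtype=int)  # uninitialised storage

count = 0
for x in [[2, 3], [4, 5], [7, 6]]:
    buf[count] = x  # write into the next free row
    count += 1

# Shrink in place to the rows actually filled. For a C-contiguous array
# the first `count` rows are kept, because resize truncates the flat data.
buf.resize((count, 2), refcheck=False)
print(buf.shape)
```

If the in-place resize is a concern, buf[:count].copy() gives the same rows at the cost of one extra copy.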

For some reason, numpy.array(l), where l is a list, is often slow with large lists, whereas copying even large arrays is quite fast (I just tried to create a copy of a 100 000 000 element array; it took less than 0.5 seconds).

This discussion has more benchmarking on different options:

Fastest way to grow a numpy numeric array

I have not benchmarked the numpy.empty + ndarray.resize combo, but both should take microseconds rather than milliseconds.
