Flatten a list of lists and scalars

Question

For a matrix, we have methods like ndarray.flatten():

np.array([[1,2,3],[4,5,6],[7,8,9]]).flatten()

gives [1,2,3,4,5,6,7,8,9].

What if I wanted to get from np.array([[1,2,3],[4,5,6],7]) to [1,2,3,4,5,6,7]?
Is there an existing function that does something like that?

Answer

With uneven lists, the array has object dtype (and is 1d, so flatten doesn't change it):

In [96]: arr=np.array([[1,2,3],[4,5,6],7])
In [97]: arr
Out[97]: array([[1, 2, 3], [4, 5, 6], 7], dtype=object)
In [98]: arr.sum()
...
TypeError: can only concatenate list (not "int") to list

The 7 element is causing the problem. If I change it to a list:

In [99]: arr=np.array([[1,2,3],[4,5,6],[7]])
In [100]: arr.sum()
Out[100]: [1, 2, 3, 4, 5, 6, 7]

I'm using a trick here. sum adds up the elements of the array; those elements are lists, and for lists + means concatenation, e.g. [1,2,3]+[4,5] gives [1,2,3,4,5].

The basic point is that an object array is not a 2d array. In many ways it is more like a list of lists.
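To make this concrete, here is a minimal sketch that builds the mixed array and inspects it. It passes dtype=object explicitly, which is an addition to the session above: NumPy 1.24 and later raise a ValueError for ragged input unless it is given.

```python
import numpy as np

# dtype=object is passed explicitly (an assumption beyond the original
# session); newer NumPy versions require it for ragged input.
arr = np.array([[1, 2, 3], [4, 5, 6], 7], dtype=object)

print(arr.shape)   # 1d: three elements (two lists and an int)
print(arr.dtype)   # object

# flatten copies the 1d array but does not look inside the lists
flat = arr.flatten()
print(flat.shape == arr.shape)

# The elements are ordinary Python objects, not rows of a matrix
print(type(arr[0]), type(arr[2]))
```

The array holds three Python objects, so NumPy has no second axis to flatten.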

The best list flattener is itertools.chain:

In [104]: list(itertools.chain(*arr))
Out[104]: [1, 2, 3, 4, 5, 6, 7]

though it too will choke on the integer 7 version.
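One way around that, sketched here as an assumption rather than something from the original answer, is to wrap each scalar in a one-element list before chaining. The helper name flatten_mixed is hypothetical.

```python
from itertools import chain


def flatten_mixed(items):
    """Flatten one level, treating non-list elements as single items.

    flatten_mixed is a hypothetical helper, not part of itertools or NumPy.
    """
    # Wrap scalars in a one-element list so chain can iterate over everything.
    wrapped = (x if isinstance(x, (list, tuple)) else [x] for x in items)
    return list(chain.from_iterable(wrapped))


print(flatten_mixed([[1, 2, 3], [4, 5, 6], 7]))  # [1, 2, 3, 4, 5, 6, 7]
```

This only flattens one level of nesting; deeper nesting would need a recursive variant.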

If the array is a list of lists (not the original mix of lists and a scalar), then np.concatenate works. It iterates over the object array just as though it were a list.

With the mixed original list, concatenate does not work, but hstack does:

In [178]: arr=np.array([[1,2,3],[4,5,6],7])
In [179]: np.concatenate(arr)
...
ValueError: all the input arrays must have same number of dimensions
In [180]: np.hstack(arr)
Out[180]: array([1, 2, 3, 4, 5, 6, 7])

That's because hstack first iterates through the list and makes sure all elements are atleast_1d. This extra iteration makes it more robust, but at a cost in processing speed.
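The behaviour described above can be reproduced by hand. This is a rough sketch of the idea, not hstack's actual source, and it again assumes dtype=object is passed explicitly so that the array can be built on recent NumPy versions.

```python
import numpy as np

# dtype=object is an assumption needed on recent NumPy versions.
arr = np.array([[1, 2, 3], [4, 5, 6], 7], dtype=object)

# Roughly what hstack does for this input: promote each element to at
# least 1d, then concatenate the resulting 1d arrays.
pieces = [np.atleast_1d(x) for x in arr]
manual = np.concatenate(pieces)

print(manual)
print(np.array_equal(manual, np.hstack(arr)))
```

The scalar 7 becomes a one-element array, so every piece has the same number of dimensions and concatenate no longer complains.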

In [170]: big1=arr.repeat(1000)
In [171]: timeit big1.sum()
10 loops, best of 3: 31.6 ms per loop
In [172]: timeit list(itertools.chain(*big1))
1000 loops, best of 3: 433 µs per loop
In [173]: timeit np.concatenate(big1)
100 loops, best of 3: 5.05 ms per loop

Doubling the size:

In [174]: big1=arr.repeat(2000)
In [175]: timeit big1.sum()
10 loops, best of 3: 128 ms per loop
In [176]: timeit list(itertools.chain(*big1))
1000 loops, best of 3: 803 µs per loop
In [177]: timeit np.concatenate(big1)
100 loops, best of 3: 9.93 ms per loop
In [182]: timeit np.hstack(big1)    # the extra iteration hurts hstack speed
10 loops, best of 3: 43.1 ms per loop

sum is O(n^2): doubling the size roughly quadruples the time (31.6 ms to 128 ms). It effectively does

res = []
for e in big1:
    res = res + e   # builds a new, longer list on every step

res grows with each e that is added, so each iteration step is more expensive than the last.
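The quadratic growth can be checked without timing anything by counting how many elements get copied. This is an illustrative sketch in plain Python; the function name copies_with_plus is hypothetical, and the input sizes simply mirror the 1000 and 2000 repeats used in the timings.

```python
def copies_with_plus(lists):
    """Count elements copied when accumulating with res = res + e."""
    res, copied = [], 0
    for e in lists:
        # res + e copies both operands into a brand new list
        copied += len(res) + len(e)
        res = res + e
    return copied


small = [[1, 2, 3]] * 1000
large = [[1, 2, 3]] * 2000

ratio = copies_with_plus(large) / copies_with_plus(small)
print(ratio)  # close to 4: doubling the input roughly quadruples the work
```

chain and concatenate avoid this because they visit each element once instead of recopying the accumulated result.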

chain has the best times.
