How do I stack vectors of different lengths in NumPy?

Problem description

How do I stack column-wise n vectors of shape (x,) where x could be any number?

For example,

import numpy as np

a = np.ones((3,))
b = np.ones((2,))

c = np.vstack((a, b))  # <-- gives an error
c = np.vstack((a[:, np.newaxis], b[:, np.newaxis]))  # <-- also gives an error

hstack works fine but concatenates along the wrong dimension.

Solution

Short answer: you can't. NumPy does not support jagged arrays natively.

Long answer:

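>>> import numpy as np
>>> a = np.ones((3,))
>>> b = np.ones((2,))
>>> c = np.array([a, b], dtype=object)  # recent NumPy raises ValueError on ragged input without dtype=object
>>> c
array([array([1., 1., 1.]), array([1., 1.])], dtype=object)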

gives a one-dimensional array of dtype object whose elements are themselves arrays, and it may or may not behave as you expect. For example, basic methods like sum or reshape do not work on it the way they would on a regular 2-D array, so you should treat it much as you'd treat the ordinary Python list [a, b] (iterate over it to perform operations instead of using vectorized idioms).
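As a minimal sketch of the "treat it like a list" advice, the snippet below builds the object array explicitly and then loops over its elements. The variable names row_sums and row_lengths are illustrative only and are not part of the original answer.

```python
import numpy as np

a = np.ones((3,))
b = np.ones((2,))

# Object array holding two 1-D arrays of different lengths.
c = np.array([a, b], dtype=object)

# Iterate over the elements, exactly as with the plain list [a, b];
# each element is an ordinary ndarray with its own length.
row_sums = [row.sum() for row in c]
row_lengths = [row.shape[0] for row in c]

print(c.shape)      # the outer array is 1-D with one slot per vector
print(row_sums)
print(row_lengths)
```

Each per-row operation is vectorized inside the row, but the loop over rows runs in Python, which is the cost of keeping the data ragged.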

Several possible workarounds exist; the easiest is to coerce a and b to a common length, perhaps using masked arrays or NaN to signal that some indices are invalid in some rows. E.g. here's b as a masked array:

>>> np.ma.array(np.resize(b, a.shape[0]), mask=[False, False, True])
masked_array(data=[1.0, 1.0, --],
             mask=[False, False,  True],
       fill_value=1e+20)

This can be stacked with a as follows:

>>> np.ma.vstack([a, np.ma.array(np.resize(b, a.shape[0]), mask=[False, False, True])])
masked_array(
  data=[[1.0, 1.0, 1.0],
        [1.0, 1.0, --]],
  mask=[[False, False, False],
        [False, False,  True]],
  fill_value=1e+20)
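The same idea can be generalized to any number of vectors. The following is a rough sketch, not the answerer's code: pad_stack is a hypothetical helper name, and it assumes float data so that NaN can serve as the padding value. It pads every vector to the longest length, then shows both the NaN route and the masked-array route.

```python
import numpy as np

def pad_stack(vectors, fill=np.nan):
    """Stack 1-D vectors of unequal length as rows, padding short rows with `fill`.

    Hypothetical helper for illustration; assumes the data can be stored as float.
    """
    width = max(len(v) for v in vectors)
    out = np.full((len(vectors), width), fill, dtype=float)
    for i, v in enumerate(vectors):
        out[i, :len(v)] = v  # copy the real values, leave the tail as padding
    return out

a = np.ones((3,))
b = np.ones((2,))

padded = pad_stack([a, b])              # shape (2, 3); padded[1, 2] is NaN
masked = np.ma.masked_invalid(padded)   # same data, with NaN positions masked

col_means = np.nanmean(padded, axis=0)  # NaN-aware reduction skips the padding
row_sums = masked.sum(axis=1)           # masked reduction skips the padding too
columns = padded.T                      # transpose if each vector should be a column
```

Note that np.ma.masked_invalid also masks any NaN or inf that was already present in the input, so if the data can legitimately contain NaN, building the mask from the vector lengths is the safer choice.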

(For some purposes, scipy.sparse may also be interesting.)
