Reshape list of unequal length lists into numpy array


Question

I have an array with dtype=object whose elements are pairs of coordinates observed at different times, and I want to reshape it into a simpler format. I managed to do this for "one time", but I can't get it to work for all time observations.

Each observation has a different length, so perhaps I need masked values to do this. The example below should explain more clearly what I want.

# My "input" is:
a = np.array([[], [(2, 0), (2, 2)], [(2, 2), (2, 0), (2, 1), (2, 2)]], dtype=object)

#And my "output" is:

#holding_array_VBPnegl
array([[2, 0],
       [2, 2],
       [2, 1]])

# The result ignores my for loop over a.shape[0], so the expected result is:
test = np.array([[[True, True],
                  [True, True],
                  [True, True]],

                 [[2, 0],
                  [2, 2],
                  [True, True]],

                 [[2, 0],
                  [2, 2],
                  [2, 1]]])

# where True marks the masked values

I tried using code I found on StackOverflow:

import numpy as np

holding_list_VBPnegl=[]
for i in range(a.shape[0]):
    for x in a[i]:
        if x in holding_list_VBPnegl:
            pass
        else:
            holding_list_VBPnegl.append(x)

print(holding_list_VBPnegl)
holding_array_VBPnegl = np.asarray(holding_list_VBPnegl)

Recommended answer

NumPy arrays work best as blocks of contiguous memory, so you'll first need to preallocate the required amount of memory. You can get its size from the length of your array a (which I'll gladly convert to a list: don't abuse numpy arrays for storing lists of unequal length) (you refer to the observations as a sequence of timesteps, yes?) and from the length of the longest observation (4 in this case, the last element of a).

import numpy as np
a = np.array([[], [(2, 0), (2, 2)], [(2, 2), (2, 0), (2, 1), (2, 2)]], dtype=object)

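s = a.tolist()  # Lists are a better container type for your data...
n_obs = len(s)                        # number of observations (timesteps)
max_len = max(len(obs) for obs in s)  # length of the longest observation

# Preallocate a float array filled with NaN; NaN marks the slots to mask later
m = np.full((n_obs, max_len, 2), np.nan)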

Now you've preallocated what you need and prepared the array for masking. All that remains is to fill it with the data you already have:

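for rowind, row in enumerate(s):
    if row:  # skip empty observations; their slots stay NaN
        m[rowind, :len(row), :] = np.array(row)

# np.int was removed in NumPy 1.24, so use the builtin int. Replace NaN with 0
# before casting, since casting NaN to int is undefined; those slots are masked anyway.
mask = np.isnan(m)
result = np.ma.masked_array(np.where(mask, 0, m).astype(int), mask=mask)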
result
masked_array(data =
 [[[-- --]
  [-- --]
  [-- --]
  [-- --]]

 [[2 0]
  [2 2]
  [-- --]
  [-- --]]

 [[2 2]
  [2 0]
  [2 1]
  [2 2]]],
             mask =
 [[[ True  True]
  [ True  True]
  [ True  True]
  [ True  True]]

 [[False False]
  [False False]
  [ True  True]
  [ True  True]]

 [[False False]
  [False False]
  [False False]
  [False False]]],
       fill_value = 999999)
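
As an alternative to the NaN round trip above, here is a minimal sketch that builds the masked array directly with np.ma.masked_all. The helper name pad_to_masked_array and its width and dtype parameters are hypothetical, introduced only for illustration. It assumes the default soft mask, under which assigning to a slice unmasks the assigned entries.

```python
import numpy as np


def pad_to_masked_array(observations, width=2, dtype=int):
    """Pad a list of unequal-length lists of points into one masked array.

    Hypothetical helper for illustration; not part of NumPy.
    """
    n_obs = len(observations)
    max_len = max((len(obs) for obs in observations), default=0)
    # Start fully masked; the assignments below unmask only the filled slots
    out = np.ma.masked_all((n_obs, max_len, width), dtype=dtype)
    for i, obs in enumerate(observations):
        if len(obs):  # empty observations stay fully masked
            out[i, :len(obs), :] = np.asarray(obs, dtype=dtype)
    return out


s = [[], [(2, 0), (2, 2)], [(2, 2), (2, 0), (2, 1), (2, 2)]]
result = pad_to_masked_array(s)

# Number of unmasked points in each observation
valid_per_obs = result.count(axis=1)[:, 0]
```

Because the array never passes through float, no NaN-to-int cast is needed. If a plain ndarray is required afterwards, result.filled(-1) replaces the masked slots with a sentinel of your choice.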
