Convert and pad a list to numpy array

Question

I have an arbitrarily deeply nested list whose elements have varying lengths:

my_list = [[[1,2],[4]],[[4,4,3]],[[1,2,1],[4,3,4,5],[4,1]]]

I want to convert this to a valid numeric (not object) numpy array, by padding out each axis with NaN. So the result should look like

padded_list = np.array([[[  1,   2, nan, nan],
                         [  4, nan, nan, nan],
                         [nan, nan, nan, nan]],
                        [[  4,   4,   3, nan],
                         [nan, nan, nan, nan],
                         [nan, nan, nan, nan]],
                        [[  1,   2,   1, nan],
                         [  4,   3,   4,   5],
                         [  4,   1, nan, nan]]])

How do I do this?

Solution

This works on your sample, though I'm not sure it handles every corner case properly:

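from itertools import zip_longest  # Python 3 name; Python 2 called it izip_longest

import numpy as np

def find_shape(seq):
    # Recursively find the smallest shape that can hold every sub-sequence.
    try:
        len_ = len(seq)
    except TypeError:
        return ()  # scalars have no length, so they add no axes
    shapes = [find_shape(subseq) for subseq in seq]
    # Per axis, take the largest size found among the children.
    return (len_,) + tuple(max(sizes) for sizes in zip_longest(*shapes,
                                                               fillvalue=1))

def fill_array(arr, seq):
    # Copy seq into arr in place, padding unused positions with NaN.
    if arr.ndim == 1:
        try:
            len_ = len(seq)
        except TypeError:
            len_ = 0
        arr[:len_] = seq
        arr[len_:] = np.nan
    else:
        # fillvalue=() makes a missing sub-sequence fill entirely with NaN.
        for subarr, subseq in zip_longest(arr, seq, fillvalue=()):
            fill_array(subarr, subseq)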

And now:

>>> arr = np.empty(find_shape(my_list))
>>> fill_array(arr, my_list)
>>> arr
array([[[  1.,   2.,  nan,  nan],
        [  4.,  nan,  nan,  nan],
        [ nan,  nan,  nan,  nan]],

       [[  4.,   4.,   3.,  nan],
        [ nan,  nan,  nan,  nan],
        [ nan,  nan,  nan,  nan]],

       [[  1.,   2.,   1.,  nan],
        [  4.,   3.,   4.,   5.],
        [  4.,   1.,  nan,  nan]]])

I think this is roughly what the shape discovery routines of numpy do. Since there are lots of Python function calls involved anyway, it probably won't compare that badly against the C implementation.
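
For anyone who wants both steps in a single call, here is a minimal Python 3 sketch of the same idea. It is a variant, not the code above: the names `nested_shape` and `pad_nested` are made up for illustration, and it allocates with `np.full` so the array starts out as NaN and only the real values need copying. It assumes every leaf sits at the same nesting depth and that only lists and tuples count as sequences, so strings and mixed-depth input are not handled.

```python
from itertools import zip_longest

import numpy as np


def nested_shape(seq):
    """Return the smallest shape that can hold every sub-sequence of seq."""
    if not isinstance(seq, (list, tuple)):
        return ()  # a scalar leaf contributes no further axes
    child_shapes = [nested_shape(sub) for sub in seq]
    # Per axis, take the longest length seen among the children.
    inner = tuple(max(sizes) for sizes in zip_longest(*child_shapes, fillvalue=1))
    return (len(seq),) + inner


def pad_nested(seq, fill=np.nan):
    """Hypothetical helper: copy seq into a float array pre-filled with fill."""
    out = np.full(nested_shape(seq), fill, dtype=float)

    def copy_into(arr, sub):
        if arr.ndim == 1:
            arr[:len(sub)] = sub  # the tail keeps the fill value
        else:
            # zip stops at the shorter input, so missing rows stay filled.
            for child_arr, child in zip(arr, sub):
                copy_into(child_arr, child)

    copy_into(out, seq)
    return out


my_list = [[[1, 2], [4]], [[4, 4, 3]], [[1, 2, 1], [4, 3, 4, 5], [4, 1]]]
padded = pad_nested(my_list)
print(padded.shape)  # (3, 3, 4)
```

Because the array is pre-filled, plain `zip` is enough here and no explicit NaN assignment is needed. Checking `isinstance` instead of calling `len()` also keeps a stray string from recursing forever, at the cost of ignoring other sequence types.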
