Converting dictionary with known indices to a multidimensional array


Question

I have a dictionary with entries labelled as {(k,i): value, ...}. I want to convert this dictionary into a 2D array in which the element at position [k,i] is the dictionary value with label (k,i). The rows will not necessarily have the same length (e.g. row k = 4 may go up to index i = 60 while row k = 24 may only go up to index i = 31). Because of this asymmetry, it is fine to set all additional entries in a row to 0 so that the matrix is rectangular.

Recommended answer

Here's one approach:

import numpy as np

# Get keys (as indices for output) and values as arrays.
# In Python 3, d.keys() and d.values() are views, so wrap them in list() first.
idx = np.array(list(d.keys()))
vals = np.array(list(d.values()))

# Get dimensions of output array based on max extents of indices
dims = idx.max(0)+1

# Setup output array and assign values into it indexed by those indices
out = np.zeros(dims,dtype=vals.dtype)
out[idx[:,0],idx[:,1]] = vals
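The steps above can be wrapped into one self-contained function. This is a minimal sketch for Python 3; the function name `dict_to_array` is hypothetical, and the sketch assumes a non-empty dictionary whose keys are all tuples of two non-negative integers.

```python
import numpy as np

def dict_to_array(d):
    """Scatter {(k, i): value} into a zero-padded 2D array (hypothetical helper)."""
    # Keys become an (n, 2) integer array of indices, values an (n,) array
    idx = np.array(list(d.keys()))
    vals = np.array(list(d.values()))
    # Output shape: largest index along each axis, plus one
    dims = tuple(idx.max(axis=0) + 1)
    out = np.zeros(dims, dtype=vals.dtype)
    # Advanced indexing writes each value at its (row, column) position
    out[idx[:, 0], idx[:, 1]] = vals
    return out

d = {(1, 4): 120, (2, 2): 72, (2, 3): 100, (5, 2): 88}
out = dict_to_array(d)
print(out.shape)  # (6, 5)
```

Positions that never appear as a key keep the 0 written by `np.zeros`, which gives the rectangular padding asked for in the question.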

We could also use sparse matrices to get the final output, e.g. coordinate format (COO) sparse matrices. This is memory-efficient as long as the result is kept as a sparse matrix. So, the last step could be replaced by something like this (the trailing .toarray() call converts the result back to a dense array and can be dropped to stay sparse):

from scipy.sparse import coo_matrix

out = coo_matrix((vals, (idx[:,0], idx[:,1])), dims).toarray()
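To illustrate the memory point, here is a small sketch that builds the COO matrix and converts to a dense array only when needed. It assumes SciPy is installed and reuses the sample dictionary from this answer; the names `sparse_out` and `dense_out` are illustrative only.

```python
import numpy as np
from scipy.sparse import coo_matrix

d = {(1, 4): 120, (2, 2): 72, (2, 3): 100, (5, 2): 88}
idx = np.array(list(d.keys()))
vals = np.array(list(d.values()))
# Plain Python ints for the shape: max index along each axis, plus one
shape = tuple(int(n) for n in idx.max(axis=0) + 1)

# Only the four (row, column, value) triples are stored here
sparse_out = coo_matrix((vals, (idx[:, 0], idx[:, 1])), shape=shape)
print(sparse_out.nnz)  # 4

# Densify only at the point where a regular ndarray is required
dense_out = sparse_out.toarray()
```

Whether staying sparse pays off depends on how many of the positions are actually filled; for a nearly full matrix the dense array is likely the simpler choice.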

Sample run:

In [70]: d
Out[70]: {(1, 4): 120, (2, 2): 72, (2, 3): 100, (5, 2): 88}

In [71]: out
Out[71]: 
array([[  0,   0,   0,   0,   0],
       [  0,   0,   0,   0, 120],
       [  0,   0,  72, 100,   0],
       [  0,   0,   0,   0,   0],
       [  0,   0,   0,   0,   0],
       [  0,   0,  88,   0,   0]])


To make it generic for ndarrays of any number of dimensions, we can use linear indexing and np.put to assign values into the output array. Thus, in our first approach, just replace the last step of assigning values with something like this:

np.put(out,np.ravel_multi_index(idx.T,dims),vals)

Sample run:

In [106]: d
Out[106]: {(1,0,0): 99, (1,0,4): 120, (2,0,2): 72, (2,1,3): 100, (3,0,2): 88}

In [107]: out
Out[107]: 
array([[[  0,   0,   0,   0,   0],
        [  0,   0,   0,   0,   0]],

       [[ 99,   0,   0,   0, 120],
        [  0,   0,   0,   0,   0]],

       [[  0,   0,  72,   0,   0],
        [  0,   0,   0, 100,   0]],

       [[  0,   0,  88,   0,   0],
        [  0,   0,   0,   0,   0]]])
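The N-dimensional variant can be sketched end to end as follows. The function name `dict_to_ndarray` is hypothetical, and the sketch assumes a non-empty dictionary whose keys are tuples of non-negative integers, all of the same length. The last few lines show a different technique for comparison, which is not part of the answer above: indexing with a tuple of per-axis index arrays instead of linear indices.

```python
import numpy as np

def dict_to_ndarray(d):
    """Scatter {(i0, i1, ...): value} into a zero-padded N-d array (hypothetical helper)."""
    idx = np.array(list(d.keys()))    # shape (n, ndim)
    vals = np.array(list(d.values()))
    dims = tuple(int(n) for n in idx.max(axis=0) + 1)
    out = np.zeros(dims, dtype=vals.dtype)
    # Convert each N-d index into an offset into the flattened array
    flat = np.ravel_multi_index(tuple(idx.T), dims)
    # np.put writes in place at those flat offsets
    np.put(out, flat, vals)
    return out

d3 = {(1, 0, 0): 99, (1, 0, 4): 120, (2, 0, 2): 72, (2, 1, 3): 100, (3, 0, 2): 88}
out3 = dict_to_ndarray(d3)
print(out3.shape)  # (4, 2, 5)

# Alternative technique: a tuple of index arrays, one per axis
idx3 = np.array(list(d3.keys()))
alt = np.zeros_like(out3)
alt[tuple(idx3.T)] = list(d3.values())
print(np.array_equal(out3, alt))  # True
```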
