Replace numpy matrix elements with submatrices

Views: 511 | Posted: 2018/8/2 14:05:06 | Tags: python performance numpy indexing vectorization

Question

Given that I have a square matrix of indices, such as:

import numpy as np

idxs = np.array([[1, 1],
                 [0, 1]])

and an array of square matrices of the same size as each other (not necessarily the same size as idxs):

mats = np.array([[[ 0. ,  0. ],
                  [ 0. ,  0.5]],

                 [[ 1. ,  0.3],
                  [ 1. ,  1. ]]])

I'd like to replace each index in idxs with the corresponding matrix in mats, to obtain:

array([[ 1. ,  0.3,  1. ,  0.3],
       [ 1. ,  1. ,  1. ,  1. ],
       [ 0. ,  0. ,  1. ,  0.3],
       [ 0. ,  0.5,  1. ,  1. ]])

mats[idxs] gives me a nested version of this:

array([[[[ 1. ,  0.3],
         [ 1. ,  1. ]],

        [[ 1. ,  0.3],
         [ 1. ,  1. ]]],


       [[[ 0. ,  0. ],
         [ 0. ,  0.5]],

        [[ 1. ,  0.3],
         [ 1. ,  1. ]]]])

and so I tried using reshape, but 'twas in vain! mats[idxs].reshape(4,4) returns:

array([[ 1. ,  0.3,  1. ,  1. ],
       [ 1. ,  0.3,  1. ,  1. ],
       [ 0. ,  0. ,  0. ,  0.5],
       [ 1. ,  0.3,  1. ,  1. ]])

If it helps, I found that skimage.util.view_as_blocks is the exact inverse of what I need (it can convert my desired result into the nested, mats[idxs] form).

Is there a (hopefully very) fast way to do this? For the application, my mats will still have just a few small matrices, but my idxs will be a square matrix of up to order 2^15, in which case I'll be replacing over a million indices to create a new matrix of order 2^16.

Thanks so much for your help!

Recommended answer

We are indexing into the first axis of the input array with those indices. To get the 2D output, we just need to permute axes and reshape afterwards. Thus, an approach would be with np.transpose/np.swapaxes and np.reshape, like so -

mats[idxs].swapaxes(1,2).reshape(-1,mats.shape[-1]*idxs.shape[-1])

Sample run -

In [83]: mats
Out[83]: 
array([[[1, 1],
        [7, 1]],

       [[6, 6],
        [5, 8]],

       [[7, 1],
        [6, 0]],

       [[2, 7],
        [0, 4]]])

In [84]: idxs
Out[84]: 
array([[2, 3],
       [0, 3],
       [1, 2]])

In [85]: mats[idxs].swapaxes(1,2).reshape(-1,mats.shape[-1]*idxs.shape[-1])
Out[85]: 
array([[7, 1, 2, 7],
       [6, 0, 0, 4],
       [1, 1, 2, 7],
       [7, 1, 0, 4],
       [6, 6, 7, 1],
       [5, 8, 6, 0]])
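To see why the axis swap is needed before the reshape, it can help to track the shapes one step at a time. The sketch below uses the data from the question; the variable names (R, C, m, n, nested, swapped, tiled, reference, restored) are made up for illustration, and np.block is used only as a slow cross-check, not as part of the suggested approach.

```python
import numpy as np

# Data from the question
idxs = np.array([[1, 1],
                 [0, 1]])
mats = np.array([[[0.0, 0.0],
                  [0.0, 0.5]],
                 [[1.0, 0.3],
                  [1.0, 1.0]]])

R, C = idxs.shape      # grid of blocks (rows, cols)
m, n = mats.shape[1:]  # size of each block

nested = mats[idxs]                    # shape (R, C, m, n)
swapped = nested.swapaxes(1, 2)        # shape (R, m, C, n)
tiled = swapped.reshape(R * m, C * n)  # shape (R*m, C*n)

# transpose with an explicit axis order does the same job as swapaxes(1, 2)
same = nested.transpose(0, 2, 1, 3).reshape(R * m, C * n)

# Reference assembled block by block; slow, used only for checking
reference = np.block([[mats[i] for i in row] for row in idxs])

# Going the other way recovers the nested mats[idxs] form
restored = tiled.reshape(R, m, C, n).swapaxes(1, 2)

print(tiled)
```

After the swap, each output row runs over (block column, column within block), which is exactly the order the final reshape flattens. Without the swap, the reshape flattens (row within block, column within block) instead, which is why the plain reshape(4,4) in the question scrambles the blocks.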

Performance boost with np.take (https://docs.scipy.org/doc/numpy/reference/generated/numpy.take.html) for repeated indices

With repeated indices, for performance we are better off using np.take by indexing along axis=0. Let's list out both these approaches and time it with idxs having many repeated indices.

Function definitions -

def simply_indexing_based(mats, idxs):
    # Gather blocks with fancy indexing, then lay them out as a 2D grid
    ncols = mats.shape[-1]*idxs.shape[-1]
    return mats[idxs].swapaxes(1,2).reshape(-1,ncols)

def take_based(mats, idxs):
    # Same layout step, but gather along axis 0 with np.take
    ncols = mats.shape[-1]*idxs.shape[-1]
    return np.take(mats,idxs,axis=0).swapaxes(1,2).reshape(-1,ncols)

Runtime test -

In [156]: mats = np.random.randint(0,9,(10,2,2))

In [157]: idxs = np.random.randint(0,10,(1000,1000)) # This ensures many repeated indices

In [158]: out1 = simply_indexing_based(mats, idxs)

In [159]: out2 = take_based(mats, idxs)

In [160]: np.allclose(out1, out2)
Out[160]: True

In [161]: %timeit simply_indexing_based(mats, idxs)
10 loops, best of 3: 41.2 ms per loop

In [162]: %timeit take_based(mats, idxs)
10 loops, best of 3: 27.3 ms per loop

Thus, we are seeing an overall improvement of 1.5x+.

Just to get a sense of the improvement with np.take, let's time the indexing part alone -

In [168]: %timeit mats[idxs]
10 loops, best of 3: 22.8 ms per loop

In [169]: %timeit np.take(mats,idxs,axis=0)
100 loops, best of 3: 8.88 ms per loop

For those data sizes, it's 2.5x+. Not bad!
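The timings above come from IPython's %timeit. For anyone who wants to rerun the comparison as a plain script, here is a rough sketch using the standard timeit module. The function names (tile_index, tile_take), the timings dict, and the random seed are made up for illustration, and the numbers will differ by machine and NumPy version, so none are quoted here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
mats = rng.integers(0, 9, size=(10, 2, 2))
idxs = rng.integers(0, 10, size=(1000, 1000))  # many repeated indices

def tile_index(mats, idxs):
    # Fancy indexing, then swap axes and flatten to 2D
    ncols = mats.shape[-1] * idxs.shape[-1]
    return mats[idxs].swapaxes(1, 2).reshape(-1, ncols)

def tile_take(mats, idxs):
    # Same layout, gathering along axis 0 with np.take
    ncols = mats.shape[-1] * idxs.shape[-1]
    return np.take(mats, idxs, axis=0).swapaxes(1, 2).reshape(-1, ncols)

# Both variants must agree before timing them
assert np.array_equal(tile_index(mats, idxs), tile_take(mats, idxs))

timings = {}
for name, fn in [("index", tile_index), ("take", tile_take)]:
    runs = timeit.repeat(lambda: fn(mats, idxs), number=3, repeat=3)
    timings[name] = min(runs) / 3  # best per-call time in seconds
    print(f"{name}: {timings[name] * 1e3:.2f} ms per call")
```

Taking the minimum over repeats mirrors the "best of 3" convention in the %timeit output above.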
