NumPy: split cube into cubes

Question

There is a function np.split() which can split an array along a single axis. I was wondering if there is a multi-axis version that can split along several axes at once, for example axes (0,1,2).
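To make the question concrete, here is a small sketch of what np.split() does on one axis, and what splitting along three axes looks like when done with nested calls. The array and the split counts are illustrative choices, not part of the original question.

```python
import numpy as np

arr = np.arange(4 * 4 * 4).reshape(4, 4, 4)

# np.split cuts along a single axis only
halves = np.split(arr, 2, axis=0)
print(len(halves), halves[0].shape)  # 2 (2, 4, 4)

# Splitting along three axes therefore needs nested calls
blocks = [c
          for a in np.split(arr, 2, axis=0)
          for b in np.split(a, 2, axis=1)
          for c in np.split(b, 2, axis=2)]
print(len(blocks), blocks[0].shape)  # 8 (2, 2, 2)
```

The nested version returns a Python list of views rather than one array.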

Recommended answer

Suppose the cube has shape (W, H, D) and you wish to break it up into N little cubes of shape (w, h, d). Since NumPy arrays have axes of fixed length, w must evenly divide W, and similarly for h and d.
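As a quick sketch of that precondition, the divisibility check and the number of little cubes N can be computed directly from the two shapes. The concrete values of W, H, D and w, h, d below are example values only.

```python
import numpy as np

W, H, D = 4, 6, 16   # shape of the big cube (example values)
w, h, d = 2, 3, 4    # shape of each little cube (example values)

old = np.array([W, H, D])
new = np.array([w, h, d])

# every little-cube side must divide the matching big-cube side
divides_evenly = bool(np.all(old % new == 0))
# N is the product of the per-axis block counts
N = int(np.prod(old // new))
print(divides_evenly, N)  # True 16
```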

Then there is a way to reshape the cube of shape (W, H, D) into a new array of shape (N, w, h, d).

For example, if arr = np.arange(4*4*4).reshape(4,4,4) (so (W,H,D) = (4,4,4)) and we wish to break it up into cubes of shape (2,2,2), then we could use

In [283]: arr.reshape(2,2,2,2,2,2).transpose(0,2,4,1,3,5).reshape(-1,2,2,2)
Out[283]: 
array([[[[ 0,  1],
         [ 4,  5]],

        [[16, 17],
         [20, 21]]],

...
       [[[42, 43],
         [46, 47]],

        [[58, 59],
         [62, 63]]]])
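One way to convince yourself that each entry of the result really is a contiguous (2,2,2) block of arr is to compare it against plain slices. This is only a sanity check, and the name cubes is chosen here for illustration.

```python
import numpy as np

arr = np.arange(4 * 4 * 4).reshape(4, 4, 4)
cubes = (arr.reshape(2, 2, 2, 2, 2, 2)
            .transpose(0, 2, 4, 1, 3, 5)
            .reshape(-1, 2, 2, 2))

print(cubes.shape)  # (8, 2, 2, 2)
# cube 0 is one corner block, cube 7 is the opposite corner
print(np.array_equal(cubes[0], arr[:2, :2, :2]))  # True
print(np.array_equal(cubes[7], arr[2:, 2:, 2:]))  # True
```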

The idea here is to add extra axes to the array which sort of act as place markers:

 number of repeats act as placemarkers
 o---o---o
 |   |   |
 v   v   v
(2,2,2,2,2,2)
   ^   ^   ^
   |   |   |
   o---o---o
   newshape

We can then reorder the axes (using transpose) so that the number of repeats comes first, and the newshape comes at the end:

arr.reshape(2,2,2,2,2,2).transpose(0,2,4,1,3,5)

And finally, call reshape(-1, w, h, d) to squash all the placemarking axes into a single axis. This produces an array of shape (N, w, h, d) where N is the number of little cubes.
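The three steps are easier to follow when the sides differ, so the sketch below prints the shape after each step for a (4, 6, 16) array cut into (2, 3, 4) blocks. The variable names step1, step2 and step3 are illustrative.

```python
import numpy as np

arr = np.arange(4 * 6 * 16).reshape(4, 6, 16)

# 1. interleave (repeats, block size) per axis: (2,2), (2,3), (4,4)
step1 = arr.reshape(2, 2, 2, 3, 4, 4)
# 2. move the repeat axes to the front, the block axes to the back
step2 = step1.transpose(0, 2, 4, 1, 3, 5)
# 3. squash the repeat axes into a single axis of length N
step3 = step2.reshape(-1, 2, 3, 4)

print(step1.shape)  # (2, 2, 2, 3, 4, 4)
print(step2.shape)  # (2, 2, 4, 2, 3, 4)
print(step3.shape)  # (16, 2, 3, 4)
```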

The reshape/transpose/reshape trick used above is the 3-dimensional case of a more general idea. It can be further generalized to ndarrays of any dimension:

import numpy as np
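def cubify(arr, newshape):
    """Split arr into blocks of shape newshape; returns shape (N, *newshape)."""
    oldshape = np.array(arr.shape)
    # number of blocks along each axis (integer division)
    repeats = oldshape // newshape
    # interleave repeats and block sizes: (r0, n0, r1, n1, ...)
    tmpshape = np.column_stack([repeats, newshape]).ravel()
    order = np.arange(len(tmpshape))
    # move the repeat axes to the front and the block axes to the back
    order = np.concatenate([order[::2], order[1::2]])
    # newshape must divide oldshape evenly or else ValueError will be raised
    return arr.reshape(tmpshape).transpose(order).reshape(-1, *newshape)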

print(cubify(np.arange(4*6*16).reshape(4,6,16), (2,3,4)).shape)
print(cubify(np.arange(8*8*8*8).reshape(8,8,8,8), (2,2,2,2)).shape)

which prints the shapes of the new arrays:

(16, 2, 3, 4)
(256, 2, 2, 2, 2)
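As a further illustration, the sketch below checks which block of the original array ends up at a given index k, and what happens when newshape does not divide the array's shape. The index k = 5 and the (3, 3, 3) shape are arbitrary choices for the example; cubify is repeated so the snippet runs on its own.

```python
import numpy as np

def cubify(arr, newshape):
    oldshape = np.array(arr.shape)
    repeats = (oldshape / newshape).astype(int)
    tmpshape = np.column_stack([repeats, newshape]).ravel()
    order = np.arange(len(tmpshape))
    order = np.concatenate([order[::2], order[1::2]])
    return arr.reshape(tmpshape).transpose(order).reshape(-1, *newshape)

arr = np.arange(4 * 6 * 16).reshape(4, 6, 16)
cubes = cubify(arr, (2, 3, 4))

# cube k sits at block position unravel_index(k, repeats), repeats = (2, 2, 4)
k = 5
i, j, l = np.unravel_index(k, (2, 2, 4))
block = arr[2*i:2*i + 2, 3*j:3*j + 3, 4*l:4*l + 4]
print(np.array_equal(cubes[k], block))  # True

# a newshape that does not divide the shape evenly makes reshape fail
try:
    cubify(np.arange(4 * 4 * 4).reshape(4, 4, 4), (3, 3, 3))
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```

The blocks are numbered in C order over the grid of block positions, so the last axis varies fastest.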


To "uncubify" the arrays:

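def uncubify(arr, oldshape):
    """Inverse of cubify: reassemble the blocks into an array of shape oldshape."""
    newshape = arr.shape[1:]
    oldshape = np.array(oldshape)
    repeats = oldshape // newshape
    # split the leading block axis back into one repeat axis per dimension
    tmpshape = np.concatenate([repeats, newshape])
    # interleave repeat and block axes again: (r0, n0, r1, n1, ...)
    order = np.arange(len(tmpshape)).reshape(2, -1).ravel(order='F')
    return arr.reshape(tmpshape).transpose(order).reshape(oldshape)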


Here is some test code to check that cubify and uncubify are inverses.

import numpy as np
def cubify(arr, newshape):
    oldshape = np.array(arr.shape)
    repeats = (oldshape / newshape).astype(int)
    tmpshape = np.column_stack([repeats, newshape]).ravel()
    order = np.arange(len(tmpshape))
    order = np.concatenate([order[::2], order[1::2]])
    # newshape must divide oldshape evenly or else ValueError will be raised
    return arr.reshape(tmpshape).transpose(order).reshape(-1, *newshape)

def uncubify(arr, oldshape):
    N, newshape = arr.shape[0], arr.shape[1:]
    oldshape = np.array(oldshape)    
    repeats = (oldshape / newshape).astype(int)
    tmpshape = np.concatenate([repeats, newshape])
    order = np.arange(len(tmpshape)).reshape(2, -1).ravel(order='F')
    return arr.reshape(tmpshape).transpose(order).reshape(oldshape)

tests = [[np.arange(4*6*16), (4,6,16), (2,3,4)],
         [np.arange(8*8*8*8), (8,8,8,8), (2,2,2,2)]]

for arr, oldshape, newshape in tests:
    arr = arr.reshape(oldshape)
    assert np.allclose(uncubify(cubify(arr, newshape), oldshape), arr)
    # cuber = Cubify(oldshape,newshape)
    # assert np.allclose(cuber.uncubify(cuber.cubify(arr)), arr)
