Theano max_pool_3d


Problem Description

How do I extend Theano's downsample.max_pool_2d_same_size to pool not only within a feature map, but also across feature maps, in an efficient manner?

Let's say I have 3 feature maps, each of size 10x10; that is a 4D tensor of shape (1, 3, 10, 10). First, max pool ((2, 2), non-overlapping) each (10, 10) feature map. The results are 3 sparse feature maps, still (10, 10), but with most values equal to zero: within each (2, 2) window, at most one value is greater than zero. This is what downsample.max_pool_2d_same_size does.
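As an illustration of this "same size" pooling, here is a minimal NumPy sketch (my own re-implementation of the semantics, not Theano's actual code; the function name is made up):

```python
import numpy as np

def max_pool_same_size(fmap, pool=(2, 2)):
    """Keep each non-overlapping pool-window maximum in place, zero the rest.

    NumPy sketch of downsample.max_pool_2d_same_size for one 2D feature map.
    """
    out = np.zeros_like(fmap)
    ph, pw = pool
    for i in range(0, fmap.shape[0], ph):
        for j in range(0, fmap.shape[1], pw):
            window = fmap[i:i + ph, j:j + pw]
            # position of the window maximum, in window coordinates
            r, c = np.unravel_index(np.argmax(window), window.shape)
            out[i + r, j + c] = window[r, c]
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
# Window maxima 5, 7, 13, 15 stay at their original positions;
# everything else becomes zero.
print(max_pool_same_size(x))
```

The output has the same shape as the input, with exactly one nonzero entry per (2, 2) window, which is what makes the later unpooling step straightforward.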

Next, I want to compare each maximum of a given (2, 2) window to the maxima of all other feature maps at the same window position, and keep only the maximum across all of the feature maps. The result is again 3 feature maps of size (10, 10), with nearly all values being zero.
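Continuing the sketch (again a hypothetical helper of my own, not library code): after per-map pooling, the cross-map step keeps, within each (2, 2) window, only the value that is maximal over all feature maps:

```python
import numpy as np

def cross_map_max(sparse_maps, pool=(2, 2)):
    """For each pooling window, keep only entries equal to the maximum taken
    over ALL feature maps in that window; zero everything else.

    sparse_maps: array of shape (n_maps, H, W), e.g. the per-map pooled
    result. Ties survive in every map that attains the maximum.
    """
    out = np.zeros_like(sparse_maps)
    _, H, W = sparse_maps.shape
    ph, pw = pool
    for i in range(0, H, ph):
        for j in range(0, W, pw):
            block = sparse_maps[:, i:i + ph, j:j + pw]
            m = block.max()
            if m > 0:  # all-zero windows stay zero
                out[:, i:i + ph, j:j + pw] = np.where(block == m, block, 0.0)
    return out

maps = np.array([[[1., 2.], [3., 4.]],
                 [[5., 0.], [0., 0.]]])   # two 2x2 maps, one (2, 2) window
print(cross_map_max(maps))  # only the 5 in the second map survives
```

Note this is a straightforward loop for clarity; it is not the "efficient manner" the question asks for, just a specification of the desired result.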

Is there a fast way of doing this? I wouldn't mind using other max-pooling functions, but I need the exact locations of the maxima for pooling/unpooling purposes (though that's another topic).

Answer

I solved it using Lasagne with cuDNN. Here are some minimal examples of how to get the indices of a max pooling operation (2D and 3D). See https://groups.google.com/forum/#!topic/lasagne-users/BhtKsRmFei4

import numpy as np
import theano
import theano.tensor as T
from theano.tensor.type import TensorType
from theano.configparser import config
import lasagne

def tensor5(name=None, dtype=None):
    # Theano only ships symbolic types up to tensor4; build a 5D type by hand.
    if dtype is None:
        dtype = config.floatX
    ttype = TensorType(dtype, (False, False, False, False, False))
    return ttype(name)

def max_pooling_2d():
    input_var = T.tensor4('input')
    input_layer = lasagne.layers.InputLayer(shape=(None, 2, 4, 4), input_var=input_var)
    max_pool_layer = lasagne.layers.MaxPool2DLayer(input_layer, pool_size=(2, 2))

    pool_in, pool_out = lasagne.layers.get_output([input_layer, max_pool_layer])
    # The gradient of the pooled output w.r.t. the input (with the upstream
    # gradient set to ones) is 1 exactly at the pooling winners.
    indices = T.grad(None, wrt=pool_in, known_grads={pool_out: T.ones_like(pool_out)})
    get_indices_fn = theano.function([input_var], indices, allow_input_downcast=True)

    data = np.random.randint(low=0, high=9, size=32).reshape((1, 2, 4, 4))
    indices = get_indices_fn(data)
    print(data, "\n\n", indices)

def max_pooling_3d():
    input_var = tensor5('input')
    # 5 input dimensions: (batchsize, channels, 3 spatial dimensions)
    input_layer = lasagne.layers.InputLayer(shape=(1, 1, 2, 4, 4), input_var=input_var)
    max_pool_layer = lasagne.layers.dnn.MaxPool3DDNNLayer(input_layer, pool_size=(2, 2, 2))

    pool_in, pool_out = lasagne.layers.get_output([input_layer, max_pool_layer])
    indices = T.grad(None, wrt=pool_in, known_grads={pool_out: T.ones_like(pool_out)})
    get_indices_fn = theano.function([input_var], indices, allow_input_downcast=True)

    data = np.random.randint(low=0, high=9, size=32).reshape((1, 1, 2, 4, 4))
    indices = get_indices_fn(data)
    print(data, "\n\n", indices)
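The key idea in both functions is the T.grad(None, ..., known_grads={pool_out: ones}) call: the gradient of a max pool with respect to its input is 1 at each window's winner and 0 elsewhere, so it doubles as an index mask. A NumPy sketch of that equivalence (my own illustration, assuming non-overlapping windows and unique maxima):

```python
import numpy as np

def winner_mask(fmap, pool=(2, 2)):
    """Gradient-of-ones analogue: 1 at each window's argmax, 0 elsewhere."""
    mask = np.zeros_like(fmap)
    ph, pw = pool
    for i in range(0, fmap.shape[0], ph):
        for j in range(0, fmap.shape[1], pw):
            window = fmap[i:i + ph, j:j + pw]
            r, c = np.unravel_index(np.argmax(window), window.shape)
            mask[i + r, j + c] = 1.0
    return mask

x = np.array([[1., 9., 0., 2.],
              [3., 4., 8., 5.],
              [2., 2., 1., 1.],
              [6., 0., 0., 7.]])
m = winner_mask(x)
# x * m keeps exactly the pooled values "in place", and m itself gives the
# maxima locations needed for unpooling.
```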
