Implementing im2col in TensorFlow

Views: 181 Published: 2020/5/4 9:35:19 Tags: python machine-learning tensorflow neural-network conv-neural-network


Problem description

I wish to implement an operation similar to 2D convolution in TensorFlow. As I understand it, the most common approach to implementing convolution is to first apply an im2col operation to the image (see here, subsection "Implementation as Matrix Multiplication"). This operation transforms the image into a 2D matrix whose columns are the flattened "chunks" of the image to which the kernel is applied.

In other words, this excerpt from the resource linked above explains nicely what im2col does:

[...] For example, if the input is [227x227x3] (in the format height x width x n_channels) and it is to be convolved with 11x11x3 filters at stride 4, then we would take [11x11x3] blocks of pixels in the input and stretch each block into a column vector of size 11*11*3 = 363. Iterating this process in the input at stride of 4 gives (227-11)/4+1 = 55 locations along both width and height, leading to an output matrix X_col of im2col of size [363 x 3025], where every column is a stretched out receptive field and there are 55*55 = 3025 of them in total. Note that since the receptive fields overlap, every number in the input volume may be duplicated in multiple distinct columns.
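The arithmetic in the excerpt can be checked with a few lines of Python. This is only a minimal sketch: the helper name im2col_shape is made up for illustration, and it assumes an unpadded input, as in the quoted example.

```python
def im2col_shape(height, width, channels, kernel, stride):
    """Return (rows, cols) of the im2col matrix for an unpadded input."""
    out_h = (height - kernel) // stride + 1   # patch positions along the height
    out_w = (width - kernel) // stride + 1    # patch positions along the width
    rows = kernel * kernel * channels         # length of one flattened patch
    cols = out_h * out_w                      # one column per patch position
    return rows, cols


rows, cols = im2col_shape(227, 227, 3, kernel=11, stride=4)
print(rows, cols)  # 363 3025
```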

As I understand from the TensorFlow docs, that is what's done internally with tf.nn.conv2d as well.

Now, I would like to implement this im2col operation in TensorFlow separately, because I want access to the intermediate result. As it involves copying values in a non-trivial way, how would I build a relatively efficient computational graph for this operation myself? Similarly, how would one implement the reverse operation?

Recommended answer

You can easily do this using extract_image_patches (exposed as tf.image.extract_patches in TensorFlow 2.x).

This function puts each filter_size x filter_size patch of the image into the depth dimension. With stride 1 and 'SAME' padding the result is a [batch_size, height, width, filter_size * filter_size * channels] tensor; for the 3x3 filter and single-channel image used below, that is [batch_size, height, width, 9].
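To make that layout concrete, here is a NumPy sketch of what such a patch extraction produces for stride 1 and zero ('SAME') padding. It only illustrates the output layout and is not TensorFlow's implementation; the function name patches_in_depth is hypothetical, and an odd filter_size is assumed.

```python
import numpy as np


def patches_in_depth(images, filter_size):
    """Illustrative only: stride 1, zero padding, odd filter_size.

    images: array of shape [batch, height, width, channels].
    Returns an array of shape
    [batch, height, width, filter_size * filter_size * channels].
    """
    batch, height, width, channels = images.shape
    pad = filter_size // 2
    padded = np.pad(images, ((0, 0), (pad, pad), (pad, pad), (0, 0)),
                    mode="constant")
    out = np.zeros((batch, height, width,
                    filter_size * filter_size * channels), dtype=images.dtype)
    for y in range(height):
        for x in range(width):
            patch = padded[:, y:y + filter_size, x:x + filter_size, :]
            # Flatten rows, then columns, then channels into the depth axis.
            out[:, y, x, :] = patch.reshape(batch, -1)
    return out


images = np.arange(10 * 10, dtype=np.float32).reshape(1, 10, 10, 1)
patches = patches_in_depth(images, filter_size=3)
print(patches.shape)  # (1, 10, 10, 9)
```

Each depth vector holds one flattened patch, so the centre entry (index 4 for a 3x3 filter) is the pixel the patch is centred on.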

To compare against tf.nn.conv2d, you can implement the Sobel operator for images:

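import numpy as np
import tensorflow as tf

# Written for TensorFlow 2.x (eager execution). In TensorFlow 1.x the same op
# is exposed as tf.extract_image_patches and needs a tf.Session to evaluate.

# A single 10x10 one-channel image in NHWC layout.
image = np.arange(10 * 10 * 1).reshape(1, 10, 10, 1)
images = tf.convert_to_tensor(image.astype(np.float32))

filter_size = 3
sobel_x = tf.constant([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], tf.float32)
# Filter layout: [height, width, in_channels, out_channels].
sobel_x_filter = tf.reshape(sobel_x, [3, 3, 1, 1])

# im2col step: every 3x3 patch is flattened into the depth axis -> [1, 10, 10, 9].
image_patches = tf.image.extract_patches(images,
                                         sizes=[1, filter_size, filter_size, 1],
                                         strides=[1, 1, 1, 1],
                                         rates=[1, 1, 1, 1],
                                         padding='SAME')

# Convolution as a weighted sum over each flattened patch.
actual = tf.reduce_sum(tf.multiply(image_patches, tf.reshape(sobel_x_filter, [9])), 3, keepdims=True)
expected = tf.nn.conv2d(images, sobel_x_filter, strides=[1, 1, 1, 1], padding='SAME')

print(tf.reduce_sum(expected - actual).numpy())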

This prints 0.0, since the two results are equivalent. Computing the convolution this way does not need a reverse function.
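The question also asks about the reverse operation. If it is wanted for its own sake, one common definition (often called col2im) scatters every column back to its patch location and sums the values where patches overlap. The NumPy sketch below assumes a single-channel 2D image without padding; im2col and col2im are illustrative helper names here, not TensorFlow functions.

```python
import numpy as np


def im2col(image, k, stride=1):
    """Stack each k x k patch of a 2D image as one column."""
    height, width = image.shape
    out_h = (height - k) // stride + 1
    out_w = (width - k) // stride + 1
    cols = np.zeros((k * k, out_h * out_w), dtype=image.dtype)
    for y in range(out_h):
        for x in range(out_w):
            patch = image[y * stride:y * stride + k, x * stride:x * stride + k]
            cols[:, y * out_w + x] = patch.ravel()
    return cols


def col2im(cols, height, width, k, stride=1):
    """Scatter columns back to image positions, summing overlapping values."""
    out_h = (height - k) // stride + 1
    out_w = (width - k) // stride + 1
    image = np.zeros((height, width), dtype=cols.dtype)
    for y in range(out_h):
        for x in range(out_w):
            patch = cols[:, y * out_w + x].reshape(k, k)
            image[y * stride:y * stride + k, x * stride:x * stride + k] += patch
    return image


image = np.arange(16, dtype=np.float64).reshape(4, 4)
cols = im2col(image, k=2, stride=2)            # non-overlapping patches
restored = col2im(cols, 4, 4, k=2, stride=2)
print(np.array_equal(restored, image))         # True
```

When patches overlap, col2im(im2col(x)) returns each pixel multiplied by the number of patches covering it, so recovering x requires dividing by that count (obtainable by running the same round trip on an image of ones).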

Edit:

"As I understand from the TensorFlow docs, that is what's done internally with tf.nn.conv2d as well."

Nope, not really. TF on the GPU, for example, relies on cuDNN, which is a more complex beast (winograd, ptx, ...). Only in some circumstances does it use the im2col approach, such as in a CPU implementation and in the quantized version.
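As a rough illustration of the im2col approach mentioned above, the following NumPy sketch turns a convolution into a single matrix product. It assumes one channel, stride 1, no padding, and NumPy 1.20 or newer (for sliding_window_view); it is not how TensorFlow implements the op.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

image = np.arange(100, dtype=np.float32).reshape(10, 10)
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)

# All 3x3 windows of the image: shape (8, 8, 3, 3).
windows = sliding_window_view(image, (3, 3))
out_h, out_w = windows.shape[:2]

# im2col: one flattened window per column -> shape (9, 64).
x_col = windows.reshape(out_h * out_w, -1).T

# The filter becomes a row vector, so the convolution is one matrix product.
w_row = sobel_x.reshape(1, -1)
result = (w_row @ x_col).reshape(out_h, out_w)

# Reference: explicit sliding-window sum (cross-correlation, as in tf.nn.conv2d).
reference = np.array([[np.sum(image[y:y + 3, x:x + 3] * sobel_x)
                       for x in range(out_w)] for y in range(out_h)])
print(np.allclose(result, reference))  # True
```

With several filters, w_row simply gains one row per filter, which is why this formulation maps well onto optimized matrix-multiplication routines.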
