What does tf.nn.conv2d do in tensorflow?


Question

I was looking at the TensorFlow docs for tf.nn.conv2d here, but I can't understand what it does or what it is trying to achieve. The docs say:

#1: Flattens the filter to a 2-D matrix with shape

[filter_height * filter_width * in_channels, output_channels].

Now what does that do? Is that element-wise multiplication or just plain matrix multiplication? I also could not understand the other two points mentioned in the docs. I have written them below:

#2: Extracts image patches from the input tensor to form a virtual tensor of shape

[batch, out_height, out_width, filter_height * filter_width * in_channels].

#3: For each patch, right-multiplies the filter matrix and the image patch vector.

It would be really helpful if anyone could give an example, ideally a piece of code, and explain what is going on there and why the operation works like this.

I've tried coding a small example and printing out the shape of the result. Still, I can't understand it.

I tried something like this:

op = tf.shape(tf.nn.conv2d(tf.random_normal([1,10,10,10]), 
              tf.random_normal([2,10,10,10]), 
              strides=[1, 2, 2, 1], padding='SAME'))

with tf.Session() as sess:
    result = sess.run(op)
    print(result)

I understand bits and pieces of convolutional neural networks. I studied them here. But the implementation in TensorFlow is not what I expected, which is what prompted this question.

EDIT: So, I implemented a much simpler piece of code, but I still can't figure out what's going on, that is, how these results come about. It would be extremely helpful if anyone could tell me what process yields this output.

input = tf.Variable(tf.random_normal([1,2,2,1]))
filter = tf.Variable(tf.random_normal([1,1,1,1]))

op = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')
init = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init)

    print("input")
    print(input.eval())
    print("filter")
    print(filter.eval())
    print("result")
    result = sess.run(op)
    print(result)

Output

input
[[[[ 1.60314465]
   [-0.55022103]]

  [[ 0.00595062]
   [-0.69889867]]]]
filter
[[[[-0.59594476]]]]
result
[[[[-0.95538563]
   [ 0.32790133]]

  [[-0.00354624]
   [ 0.41650501]]]]

Recommended answer

2D convolution is computed in much the same way as 1D convolution: you slide your kernel over the input, calculate the element-wise multiplications and sum them up. But instead of your kernel/input being an array, here they are matrices.
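As a quick illustration of "multiply and sum", consider the 1x1 filter from the question's edit: each window covers a single input value, so the sum has exactly one term and every output is just the input value times the filter weight. The sketch below uses plain Python lists rather than TensorFlow, with the numbers copied from the question's printed output; the names `inp`, `w`, and `scaled` are made up for this illustration.

```python
# Values copied from the question's printed output (rounded float32).
inp = [[1.60314465, -0.55022103],
       [0.00595062, -0.69889867]]
w = -0.59594476  # the single weight of the [1, 1, 1, 1] filter

# With a 1x1 kernel, stride 1 and SAME padding, each output element
# is the matching input element multiplied by the one filter weight.
scaled = [[x * w for x in row] for row in inp]
for row in scaled:
    print(row)
```

Up to float32 rounding, the printed values agree with the `result` tensor shown in the question.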

In the most basic example there is no padding and stride=1. Let's assume your input and kernel are:

input:
4 3 1 0
2 1 0 1
1 2 4 1
3 1 0 2

kernel:
1 0 1
2 1 0
0 0 1

When you apply the kernel you get the following 2x2 output: [[14, 6], [6, 12]], which is calculated in the following way:

  • 14 = 4 * 1 + 3 * 0 + 1 * 1 + 2 * 2 + 1 * 1 + 0 * 0 + 1 * 0 + 2 * 0 + 4 * 1
  • 6 = 3 * 1 + 1 * 0 + 0 * 1 + 1 * 2 + 0 * 1 + 1 * 0 + 2 * 0 + 4 * 0 + 1 * 1
  • 6 = 2 * 1 + 1 * 0 + 0 * 1 + 1 * 2 + 2 * 1 + 4 * 0 + 3 * 0 + 1 * 0 + 0 * 1
  • 12 = 1 * 1 + 0 * 0 + 1 * 1 + 2 * 2 + 4 * 1 + 1 * 0 + 1 * 0 + 0 * 0 + 2 * 1
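The four sums above can be reproduced with a few lines of plain Python. This is only a sketch of the sliding-window idea, not how TensorFlow implements it; the helper name `conv2d_valid` is made up for this example, and the input and kernel values are the ones appearing in the products above.

```python
# Input and kernel values taken from the hand calculation.
image = [
    [4, 3, 1, 0],
    [2, 1, 0, 1],
    [1, 2, 4, 1],
    [3, 1, 0, 2],
]
kernel = [
    [1, 0, 1],
    [2, 1, 0],
    [0, 0, 1],
]


def conv2d_valid(image, kernel):
    """Slide the kernel over the image with stride 1 and no padding.

    The kernel is not flipped, which matches the hand calculation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Element-wise products over the current window, summed up.
            total = sum(image[r + i][c + j] * kernel[i][j]
                        for i in range(kh) for j in range(kw))
            row.append(total)
        out.append(row)
    return out


result = conv2d_valid(image, kernel)
print(result)  # [[14, 6], [6, 12]]
```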

TF's conv2d function calculates convolutions in batches and uses a slightly different format. For the input it is [batch, in_height, in_width, in_channels]; for the kernel it is [filter_height, filter_width, in_channels, out_channels]. So we need to provide the data in the correct format:

import tensorflow as tf
k = tf.constant([
    [1, 0, 1],
    [2, 1, 0],
    [0, 0, 1]
], dtype=tf.float32, name='k')
i = tf.constant([
    [4, 3, 1, 0],
    [2, 1, 0, 1],
    [1, 2, 4, 1],
    [3, 1, 0, 2]
], dtype=tf.float32, name='i')
kernel = tf.reshape(k, [3, 3, 1, 1], name='kernel')
image  = tf.reshape(i, [1, 4, 4, 1], name='image')

Afterwards the convolution is computed with:

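# "VALID" means no padding; tf.squeeze drops the size-1 batch and channel dimensions.
res = tf.squeeze(tf.nn.conv2d(image, kernel, [1, 1, 1, 1], "VALID"))
# tf.Session is the TensorFlow 1.x graph-mode API.
with tf.Session() as sess:
    print(sess.run(res))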

The result is the same as the one we calculated by hand.
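The three steps quoted from the docs in the question (flatten the filter, extract patches, multiply) describe the same computation in matrix form, and the multiplication in step #3 is a plain vector-matrix product, not an element-wise one. Below is a rough pure-Python sketch of those steps for stride 1 and VALID padding only. The function name `conv2d_im2col` is hypothetical, and nested lists stand in for tensors; it illustrates the description in the docs and is not TensorFlow's actual implementation.

```python
def conv2d_im2col(inputs, filters):
    """inputs: [batch][in_h][in_w][in_c]; filters: [fh][fw][in_c][out_c].

    Sketch for stride 1 and VALID padding only.
    """
    fh, fw = len(filters), len(filters[0])
    in_c, out_c = len(filters[0][0]), len(filters[0][0][0])
    # Step 1: flatten the filter to a [fh * fw * in_c, out_c] matrix.
    flat_filter = [filters[i][j][c]
                   for i in range(fh) for j in range(fw) for c in range(in_c)]
    out = []
    for img in inputs:
        out_h = len(img) - fh + 1
        out_w = len(img[0]) - fw + 1
        rows = []
        for r in range(out_h):
            cols = []
            for s in range(out_w):
                # Step 2: one image patch as a vector of length fh * fw * in_c,
                # flattened in the same (row, column, channel) order as the filter.
                patch = [img[r + i][s + j][c]
                         for i in range(fh) for j in range(fw) for c in range(in_c)]
                # Step 3: patch vector times filter matrix gives out_c values.
                cols.append([sum(p * f[o] for p, f in zip(patch, flat_filter))
                             for o in range(out_c)])
            rows.append(cols)
        out.append(rows)
    return out


k = [[1, 0, 1], [2, 1, 0], [0, 0, 1]]
i = [[4, 3, 1, 0], [2, 1, 0, 1], [1, 2, 4, 1], [3, 1, 0, 2]]
kernel = [[[[v]] for v in row] for row in k]    # shape [3, 3, 1, 1]
image = [[[[v] for v in row] for row in i]]     # shape [1, 4, 4, 1]

result = conv2d_im2col(image, kernel)
print(result)  # [[[[14], [6]], [[6], [12]]]]
```

The output has shape [1, 2, 2, 1]; squeezing the batch and channel dimensions gives the same 2x2 result as the hand calculation.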

For examples with padding/strides, take a look here.
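As a small aid for reasoning about padding and strides, the spatial output size can be worked out without running TensorFlow. The sketch below follows the sizing rules given in the TensorFlow documentation for "SAME" and "VALID" padding, assuming no dilation; the helper name `conv2d_output_size` is made up for this example.

```python
import math


def conv2d_output_size(in_size, filter_size, stride, padding):
    """Output length along one spatial dimension (dilation assumed to be 1)."""
    if padding == "SAME":
        # The input is padded so that only the stride shrinks the output.
        return math.ceil(in_size / stride)
    if padding == "VALID":
        # No padding: only windows that fit entirely inside the input count.
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


# The answer's example: 4x4 input, 3x3 kernel, stride 1, VALID.
print(conv2d_output_size(4, 3, 1, "VALID"))  # 2

# The question's first snippet: input [1, 10, 10, 10], filter [2, 10, 10, 10]
# (height 2, width 10, 10 input channels, 10 output channels), stride 2, SAME.
out_h = conv2d_output_size(10, 2, 2, "SAME")
out_w = conv2d_output_size(10, 10, 2, "SAME")
print([1, out_h, out_w, 10])  # [1, 5, 5, 10]
```

The last dimension of the output is the filter's out_channels, which is why the question's first snippet ends in 10.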
