What does tf.nn.conv2d do in tensorflow?
Question
I was looking at the tensorflow docs for tf.nn.conv2d, but I can't understand what it does or what it is trying to achieve. The docs say:
#1: Flattens the filter to a 2-D matrix with shape [filter_height * filter_width * in_channels, output_channels].
Now what does that do? Is that element-wise multiplication or just plain matrix multiplication? I also could not understand the other two points mentioned in the docs. I have written them below:
#2: Extracts image patches from the input tensor to form a virtual tensor of shape [batch, out_height, out_width, filter_height * filter_width * in_channels].
#3: For each patch, right-multiplies the filter matrix and the image patch vector.
It would be really helpful if anyone could give an example, a piece of code (extremely helpful) maybe and explain what is going on there and why the operation is like this.
I've tried coding a small portion and printing out the shape of the operation. Still, I can't understand. I tried something like this:
import tensorflow as tf

op = tf.shape(tf.nn.conv2d(tf.random_normal([1, 10, 10, 10]),
                           tf.random_normal([2, 10, 10, 10]),
                           strides=[1, 2, 2, 1], padding='SAME'))
with tf.Session() as sess:
    result = sess.run(op)
    print(result)
I understand bits and pieces of convolutional neural networks. I studied them here. But the implementation on tensorflow is not what I expected. So it raised the question.
EDIT: So, I implemented a much simpler code. But I can't figure out what's going on. I mean how the results are like this. It would be extremely helpful if anyone could tell me what process yields this output.
input = tf.Variable(tf.random_normal([1, 2, 2, 1]))
filter = tf.Variable(tf.random_normal([1, 1, 1, 1]))
op = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')
init = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init)
    print("input")
    print(input.eval())
    print("filter")
    print(filter.eval())
    print("result")
    result = sess.run(op)
    print(result)
Output
input
[[[[ 1.60314465]
[-0.55022103]]
[[ 0.00595062]
[-0.69889867]]]]
filter
[[[[-0.59594476]]]]
result
[[[[-0.95538563]
[ 0.32790133]]
[[-0.00354624]
[ 0.41650501]]]]
Recommended answer
2D convolution is computed much like 1D convolution: you slide your kernel over the input, multiply element-wise, and sum the products. The only difference is that the kernel and input are matrices instead of arrays.
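As a quick illustration of the sliding idea in one dimension, here is a minimal pure-Python sketch. The function name slide_1d and the sample values are made up for this example, and it assumes no padding and a stride of 1:

```python
def slide_1d(signal, kernel):
    """Slide `kernel` over `signal` (no padding, stride 1).

    At each position, multiply the overlapping values element-wise
    and sum the products.
    """
    width = len(kernel)
    positions = len(signal) - width + 1
    return [
        sum(s * k for s, k in zip(signal[start:start + width], kernel))
        for start in range(positions)
    ]


print(slide_1d([1, 2, 3, 4, 5], [1, 0, -1]))  # [-2, -2, -2]
```

The 2D case does the same thing, except the window moves along two axes.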
In the most basic example there is no padding and stride=1. Let's assume your input and kernel are:

input:
4 3 1 0
2 1 0 1
1 2 4 1
3 1 0 2

kernel:
1 0 1
2 1 0
0 0 1
When you apply your kernel you will receive the following output: [[14, 6], [6, 12]], which is calculated in the following way:
- 14 = 4 * 1 + 3 * 0 + 1 * 1 + 2 * 2 + 1 * 1 + 0 * 0 + 1 * 0 + 2 * 0 + 4 * 1
- 6 = 3 * 1 + 1 * 0 + 0 * 1 + 1 * 2 + 0 * 1 + 1 * 0 + 2 * 0 + 4 * 0 + 1 * 1
- 6 = 2 * 1 + 1 * 0 + 0 * 1 + 1 * 2 + 2 * 1 + 4 * 0 + 3 * 0 + 1 * 0 + 0 * 1
- 12 = 1 * 1 + 0 * 0 + 1 * 1 + 2 * 2 + 4 * 1 + 1 * 0 + 1 * 0 + 0 * 0 + 2 * 1
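The four sums above can be reproduced with a small pure-Python sketch. The helper name conv2d_valid is hypothetical (it is not a TensorFlow function); it assumes no padding and stride 1, and it does not flip the kernel, which matches the hand calculation:

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with no padding and stride 1."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Element-wise products of the kernel and the window at (r, c).
            total = sum(
                image[r + dr][c + dc] * kernel[dr][dc]
                for dr in range(kh)
                for dc in range(kw)
            )
            row.append(total)
        output.append(row)
    return output


image = [
    [4, 3, 1, 0],
    [2, 1, 0, 1],
    [1, 2, 4, 1],
    [3, 1, 0, 2],
]
kernel = [
    [1, 0, 1],
    [2, 1, 0],
    [0, 0, 1],
]
print(conv2d_valid(image, kernel))  # [[14, 6], [6, 12]]
```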
TF's conv2d function calculates convolutions in batches and uses a slightly different format. For an input it is [batch, in_height, in_width, in_channels]; for the kernel it is [filter_height, filter_width, in_channels, out_channels]. So we need to provide the data in the correct format:
import tensorflow as tf
k = tf.constant([
[1, 0, 1],
[2, 1, 0],
[0, 0, 1]
], dtype=tf.float32, name='k')
i = tf.constant([
[4, 3, 1, 0],
[2, 1, 0, 1],
[1, 2, 4, 1],
[3, 1, 0, 2]
], dtype=tf.float32, name='i')
kernel = tf.reshape(k, [3, 3, 1, 1], name='kernel')
image = tf.reshape(i, [1, 4, 4, 1], name='image')
Afterwards the convolution is computed with:
# VALID means no padding
res = tf.squeeze(tf.nn.conv2d(image, kernel, [1, 1, 1, 1], "VALID"))
with tf.Session() as sess:
    print(sess.run(res))
The result is the same as the one we calculated by hand.
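The three steps quoted from the docs in the question describe this same computation phrased as a matrix multiplication. Below is a rough pure-Python sketch of that reading, not TensorFlow's actual implementation. The function name conv2d_via_patches is hypothetical, the batch dimension is left out, and it assumes no padding and stride 1:

```python
def conv2d_via_patches(image, filt):
    """Sketch of the flatten / extract-patches / multiply steps.

    image: [in_height][in_width][in_channels]
    filt:  [filter_height][filter_width][in_channels][out_channels]
    """
    fh, fw = len(filt), len(filt[0])
    cin, cout = len(filt[0][0]), len(filt[0][0][0])
    out_h = len(image) - fh + 1
    out_w = len(image[0]) - fw + 1

    # Step 1: flatten the filter to a matrix of shape [fh * fw * cin, cout].
    flat_filter = [
        filt[i][j][c] for i in range(fh) for j in range(fw) for c in range(cin)
    ]

    result = []
    for r in range(out_h):
        row = []
        for col in range(out_w):
            # Step 2: flatten the patch at (r, col) in the same order.
            patch = [
                image[r + i][col + j][c]
                for i in range(fh) for j in range(fw) for c in range(cin)
            ]
            # Step 3: (1 x K) patch vector times (K x cout) filter matrix.
            row.append([
                sum(p * weights[o] for p, weights in zip(patch, flat_filter))
                for o in range(cout)
            ])
        result.append(row)
    return result


i = [[4, 3, 1, 0], [2, 1, 0, 1], [1, 2, 4, 1], [3, 1, 0, 2]]
k = [[1, 0, 1], [2, 1, 0], [0, 0, 1]]
image = [[[v] for v in row] for row in i]    # shape [4][4][1]
filt = [[[[v]] for v in row] for row in k]   # shape [3][3][1][1]
print(conv2d_via_patches(image, filt))  # [[[14], [6]], [[6], [12]]]
```

So it is a plain matrix multiplication per patch, and that is just another way of writing "multiply element-wise and sum". With a 1x1 filter each patch holds a single value, so every output is simply the input times the one weight. That is what the EDIT in the question shows: 1.60314465 * -0.59594476 is approximately -0.95538563.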
For examples with padding/strides, take a look here.
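As a rough guide to how padding and strides affect the result, the output size along one spatial dimension can be sketched as follows. The helper conv_output_size is a made-up name for illustration, and the formulas are the ones TensorFlow documents for its 'VALID' and 'SAME' padding modes:

```python
import math


def conv_output_size(in_size, filter_size, stride, padding):
    """Output length along one spatial dimension."""
    if padding == "VALID":
        # No padding: only windows that fit entirely inside the input.
        return math.ceil((in_size - filter_size + 1) / stride)
    if padding == "SAME":
        # Input is zero-padded so the size depends only on the stride.
        return math.ceil(in_size / stride)
    raise ValueError("padding must be 'VALID' or 'SAME'")


print(conv_output_size(4, 3, 1, "VALID"))  # 2, the 2x2 result above
print(conv_output_size(10, 2, 2, "SAME"))  # 5
print(conv_output_size(2, 1, 1, "SAME"))   # 2
```

By this rule, the first snippet in the question (a 10x10 input, strides of 2, 'SAME' padding, 10 output channels) should print a shape of [1 5 5 10].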