How should I use blobs in a Caffe Python layer, and when does their training take place?
Question
I am creating a network using Caffe, for which I need to define my own layer. I would like to use the Python
layer for this.
My layer will contain some learned parameters. From this answer, I am told that I will need to create a blob vector for this.
- Is there any specification that this blob needs to follow, such as constraints on its dimensions? Regardless of what my layer does, can I create a one-dimensional blob and use any of its elements for any computation in the layer?
- What does the diff of a blob mean? From what I understand, the diff of bottom is the gradient at the current layer, and the diff of top is the gradient for the previous layer. However, what exactly is happening here?
- When do these parameters get trained? Does this need to be done manually in the layer definition?
I have seen the examples in test_python_layer.py
, but most of them do not have any parameters.
Answer
You can add as many internal parameters as you wish, and these parameters (Blobs) may have whatever shape you want them to be.
To add Blobs (in your layer's class):
def setup(self, bottom, top):
    self.blobs.add_blob(2)       # add two blobs to this layer
    self.blobs[0].reshape(3, 4)  # first blob is 2D
    self.blobs[0].data[...] = 0  # init to 0
    self.blobs[1].reshape(10)    # second blob is 1D with 10 elements
    self.blobs[1].data[...] = 1  # init to 1
What each parameter "means" and how the blobs are organized in self.blobs is entirely up to you.
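As an illustration (not part of the original answer), the parameter blobs can then be read in forward() like ordinary arrays. The following plain-NumPy sketch stands in for self.blobs[0].data and a bias blob; the shapes and the linear computation are hypothetical choices for the example, not anything Caffe prescribes:

```python
import numpy as np

# Hypothetical stand-ins for self.blobs[0].data and self.blobs[1].data
# (plain NumPy, no Caffe required; shapes chosen only for illustration).
W = np.full((3, 4), 0.5)   # a learned (3, 4) weight "blob"
b = np.ones(3)             # a learned bias "blob"

def forward(x):
    """What a Python layer's forward() might compute from its blobs."""
    return W.dot(x) + b    # top[0].data[...] would be set to this

y = forward(np.ones(4))    # each entry: 0.5 * 4 + 1.0 = 3.0
```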
How are the trainable parameters actually "trained"?
This is one of the cool things about caffe (and other DNN toolkits as well): you don't need to worry about it!
What do you need to do? All you need is to compute the gradient of the loss w.r.t. the parameters and store it in self.blobs[i].diff. Once the gradients are computed, caffe's internals take care of updating the parameters according to the gradients, learning rate, momentum, update policy, etc.
So, you must have a non-trivial backward method for your layer:
def backward(self, top, propagate_down, bottom):
    self.blobs[0].diff[...] = ...  # gradient of the loss w.r.t. this parameter blob
    self.blobs[1].diff[...] = ...  # and likewise for all the other parameter blobs
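For concreteness, here is a hedged NumPy sketch (no Caffe involved) of the kind of arithmetic backward() has to perform for a toy linear layer y = W·x: the parameter diff is the outer product of the top diff with the input, and the bottom diff is W transposed times the top diff. The shapes and values are illustrative assumptions only:

```python
import numpy as np

# Toy layer y = W @ x with a hypothetical (3, 2) weight blob.
W = np.arange(6.0).reshape(3, 2)
x = np.array([1.0, 2.0])               # bottom[0].data
top_diff = np.array([0.5, -1.0, 2.0])  # top[0].diff, i.e. dL/dy

# What would be stored in self.blobs[0].diff[...]:
W_diff = np.outer(top_diff, x)         # dL/dW, same shape as W

# What would be stored in bottom[0].diff[...]:
bottom_diff = W.T.dot(top_diff)        # dL/dx
```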
You might want to test your implementation of the layer, once you complete it.
Have a look at this PR for a numerical test of the gradients.
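The idea behind such a numerical test can be sketched without Caffe at all: perturb each parameter by a small eps and compare the central finite-difference slope against the analytic gradient. The toy loss and tolerance below are illustrative assumptions, not caffe's actual gradient checker:

```python
import numpy as np

def loss(w):
    # Toy "loss" with a known analytic gradient: dL/dw = 2 * w
    return float((w ** 2).sum())

def numeric_grad(f, w, eps=1e-5):
    # Central finite differences, one coordinate at a time
    g = np.zeros_like(w)
    for i in range(w.size):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus.flat[i] += eps
        w_minus.flat[i] -= eps
        g.flat[i] = (f(w_plus) - f(w_minus)) / (2 * eps)
    return g

w = np.array([1.0, -2.0, 3.0])
g_num = numeric_grad(loss, w)   # should be close to the analytic 2 * w
```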