Building custom Caffe layer in python

Problem description

After parsing many links about building Caffe layers in Python, I still have difficulty understanding a few concepts. Could someone please clarify them?

  • The Python structure of blobs and weights for a network is explained here: Finding gradient of a Caffe conv-filter with regards to input.
  • The Network and Solver structure is explained here: Cheat sheet for caffe / pycaffe?.
  • An example of defining a Python layer is here: pyloss.py on git.
  • Layer tests are here: test layer on git.
  • Development of new layers in C++ is described here: git wiki.

What I am still missing is:

  1. setup() method: what should I do here? Why, in the example, should I compare the length of the 'bottom' param with 2? Why should it be 2? It does not seem to be the batch size, since that is arbitrary; and bottom, as I understand it, is a blob whose first dimension is the batch size?
  2. reshape() method: as I understand it, the 'bottom' input param is the blob of the layer below, the 'top' param is the blob of the layer above, and I need to reshape the top layer according to the output shape of my forward-pass calculations. But why do I need to do this on every forward pass if these shapes do not change from pass to pass and only the weights change?
  3. The reshape and forward methods use index 0 for the 'top' input param. Why do I need to use top[0].data=... or top[0].input=... instead of top.data=... and top.input=...? What is this index about? If we do not use other parts of this top list, why is it exposed this way? I suspect it is a coincidence of the C++ backbone, but it would be good to know exactly.
  4. reshape() method, the line with:

if bottom[0].count != bottom[1].count

What do I do here? Why is its dimension 2 again? What am I counting here? And why should both parts of the blob (0 and 1) be equal in the number of some members (count)?

forward() method: what do I define with this line:

self.diff[...] = bottom[0].data - bottom[1].data

Where is it used after the forward pass if I define it here? Can we just use

diff = bottom[0].data - bottom[1].data 

instead, and compute the loss later in this method without assigning to self, or is this done for some purpose?

backward() method: what is for i in range(2): about? Why is the range 2 again?

I'm sorry if these questions are too obvious; I just wasn't able to find a good guide to understand them, so I'm asking for help here.

Answer

You asked a lot of questions here, so I'll give you some highlights and pointers that I hope will clarify matters for you. I will not explicitly answer all your questions.

It seems you are most confused about the difference between a blob and a layer's input/output. Indeed, most layers have a single blob as input and a single blob as output, but that is not always the case. Consider a loss layer: it has two inputs, the predictions and the ground-truth labels. So in this case bottom is a vector of length 2 (!), with bottom[0] being a (4-D) blob representing the predictions and bottom[1] being another blob with the labels. Thus, when constructing such a layer you must assert that you have exactly (hard-coded) 2 input blobs (see e.g. ExactNumBottomBlobs() in the AccuracyLayer definition).
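To make this concrete, here is a minimal sketch of such a layer's setup(), modeled on the pyloss.py example linked in the question; only caffe.Layer itself is a fixed API, the class name and messages just follow that example:

import caffe
import numpy as np

class EuclideanLossLayer(caffe.Layer):
    # Sketch of a Euclidean loss layer with two bottoms: predictions and labels.

    def setup(self, bottom, top):
        # 'bottom' is a list of input blobs; this layer is hard-coded to expect
        # exactly two of them: bottom[0] = predictions, bottom[1] = labels.
        if len(bottom) != 2:
            raise Exception("Need two inputs to compute distance.")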

The same goes for the top blobs: indeed, in most cases there is a single top for each layer, but that is not always the case (see e.g. AccuracyLayer). Therefore, top is also a vector of 4-D blobs, one for each top of the layer. Most of the time there will be a single element in that vector, but sometimes you may find more than one.

I believe this covers your questions 1,3,4 and 6.

As for reshape() (Q.2): this function is not called on every forward pass; it is called only when the net is set up, to allocate space for inputs/outputs and params.
Occasionally you might want to change the input size of your net (e.g., for detection nets); then reshape() needs to be called for all layers of the net to accommodate the new input size.
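Continuing the sketch layer above (again following the pyloss.py example), reshape() is where the top blob and any internal buffers get their shapes, and forward() then fills them; self.diff is cached on the layer so backward() can reuse it:

    def reshape(self, bottom, top):
        # the two inputs must hold the same number of elements
        if bottom[0].count != bottom[1].count:
            raise Exception("Inputs must have the same dimension.")
        # difference buffer, same shape as the inputs
        self.diff = np.zeros_like(bottom[0].data, dtype=np.float32)
        # the loss output is a scalar blob
        top[0].reshape(1)

    def forward(self, bottom, top):
        # cache the element-wise difference on self so backward() can reuse it
        self.diff[...] = bottom[0].data - bottom[1].data
        top[0].data[...] = np.sum(self.diff ** 2) / bottom[0].num / 2.

And when you do change the input size from Python (the detection-net case mentioned above), a typical pycaffe call sequence looks roughly like this; the blob name 'data' and the sizes new_h, new_w are placeholders:

net.blobs['data'].reshape(1, 3, new_h, new_w)   # new input shape
net.reshape()   # propagate the new shape, calling reshape() on every layer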

As for the propagate_down parameter (Q.7): since a layer may have more than one bottom, in principle you would need to pass the gradient to all bottoms during backprop. However, what is the meaning of the gradient with respect to the label bottom of a loss layer? There are cases when you do not want to propagate to all bottoms: that is what this flag is for. (Here is an example of a loss layer with three bottoms that expects gradients for all of them.)
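In the sketch layer this is exactly the for i in range(2): loop from the question: one iteration per bottom blob, with propagate_down[i] deciding whether that bottom actually receives a gradient (the label bottom typically does not); the body again follows the pyloss.py example:

    def backward(self, top, propagate_down, bottom):
        for i in range(2):                     # one iteration per bottom blob
            if not propagate_down[i]:
                continue                       # e.g. the label bottom needs no gradient
            sign = 1 if i == 0 else -1         # gradient flips sign for the second input
            bottom[i].diff[...] = sign * self.diff / bottom[i].num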

For more information, see this "Python" layer tutorial.
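Finally, to actually use such a layer you declare it in the net prototxt with type: "Python". A sketch under the assumptions of the example above (the blob names "pred" and "label" and the module/class names are illustrative, and the .py file must be importable, i.e. on your PYTHONPATH; Caffe also has to be built with WITH_PYTHON_LAYER := 1):

layer {
  name: "loss"
  type: "Python"
  bottom: "pred"
  bottom: "label"
  top: "loss"
  python_param {
    module: "pyloss"              # the Python module (file) containing the layer class
    layer: "EuclideanLossLayer"   # the class name inside that module
  }
  loss_weight: 1                  # treat this layer's output as a loss
}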
