How is a multiple-outputs deep learning model trained?


Question

I think I do not understand multiple-output networks.

Although I understand how the implementation is made and I successfully trained one model like this, I don't understand how a multiple-output deep learning network is trained. I mean, what is happening inside the network during training?

Take, for example, this network from the Keras functional API guide:

You can see the two outputs (aux_output and main_output). How does the backpropagation work?

My intuition was that the model does two backpropagations, one for each output. Each backpropagation then updates the weights of the layers preceding the exit. But it appears that's not true: from here (SO), I got the information that there is only one backpropagation despite the multiple outputs; the loss used is weighted according to the outputs.

But still, I don't get how the network and its auxiliary branch are trained; how are the auxiliary branch weights updated, as the branch is not connected directly to the main output? Is the part of the network between the root of the auxiliary branch and the main output affected by the weighting of the loss? Or does the weighting influence only the part of the network that is connected to the auxiliary output?

Also, I'm looking for good articles about this subject. I have already read the GoogLeNet / Inception articles (v1, v2-v3), as this network uses auxiliary branches.

Answer

Keras calculations are graph-based and use only one optimizer.

The optimizer is also part of the graph, and in its calculations it gets the gradients for the whole group of weights (not two groups of gradients, one for each output, but one group of gradients for the entire model).

Mathematically, it's not really complicated: you have a final loss function made of:

loss = (main_weight * main_loss) + (aux_weight * aux_loss) #you choose the weights in model.compile

All defined by you, plus a series of other possible weights (sample weights, class weights, regularizer terms, etc.).

Where:

  • main_loss is a function_of(main_true_output_data, main_model_output)
  • aux_loss is a function_of(aux_true_output_data, aux_model_output)

And the gradients are just ∂(loss)/∂(weight_i) for all weights.
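As a sanity check of that formula, here is a minimal pure-Python sketch (not Keras code; the two quadratic losses and the 0.7/0.3 weights are invented for illustration). It verifies numerically that the gradient of the weighted sum of losses equals the weighted sum of the individual loss gradients:

```python
# Toy scalar "model": one weight w feeding two loss terms.
main_weight, aux_weight = 0.7, 0.3

def main_loss(w):
    return (w - 2.0) ** 2      # made-up loss, minimized at w = 2

def aux_loss(w):
    return (w + 1.0) ** 2      # made-up loss, minimized at w = -1

def total_loss(w):
    return main_weight * main_loss(w) + aux_weight * aux_loss(w)

def num_grad(f, w, eps=1e-6):
    # central finite difference approximation of df/dw
    return (f(w + eps) - f(w - eps)) / (2 * eps)

w = 0.5
combined = num_grad(total_loss, w)
weighted_sum = main_weight * num_grad(main_loss, w) + aux_weight * num_grad(aux_loss, w)
print(abs(combined - weighted_sum) < 1e-6)  # the two gradients agree
```

So differentiating the single combined loss is the same as weighting and summing the per-output gradients; there is no need for a separate backward pass per output.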

Once the optimizer has the gradients, it performs its optimization step once.

Question:

How are the auxiliary branch weights updated, as the branch is not connected directly to the main output?

  • You have two output datasets: one dataset for main_output and another dataset for aux_output. You must pass them to fit in model.fit(inputs, [main_y, aux_y], ...)
  • You also have two loss functions, one for each output, where main_loss takes main_y and main_out, and aux_loss takes aux_y and aux_out.
  • The two losses are summed: loss = (main_weight * main_loss) + (aux_weight * aux_loss)
  • The gradients are calculated for the function loss once, and this function connects to the entire model.
    • The aux term will affect lstm_1 and embedding_1 in backpropagation.
    • Consequently, in the next forward pass (after the weights are updated) it will end up influencing the main branch. (Whether that is better or worse depends only on whether the aux output is useful or not.)
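The point about lstm_1 and embedding_1 can be illustrated with a toy analogue (plain Python, everything scalar; the values are invented for the sketch): a shared weight s plays the role of the shared layers, and two head weights m and a play the roles of the main and aux branches. Hand-applying the chain rule shows the gradient on the shared weight mixes both loss terms, while each head weight only sees its own loss:

```python
# Toy analogue of the shared trunk + two heads:
#   h        = s * x   (shared "embedding/lstm" weight s)
#   main_out = m * h   (main head weight m)
#   aux_out  = a * h   (aux head weight a)
# loss = main_weight*(main_out - y_main)**2 + aux_weight*(aux_out - y_aux)**2

main_weight, aux_weight = 1.0, 0.2
x, y_main, y_aux = 1.0, 3.0, -1.0
s, m, a = 0.5, 1.0, 1.0

h = s * x
main_err = m * h - y_main
aux_err = a * h - y_aux

# Hand-derived gradients via the chain rule:
grad_m = main_weight * 2 * main_err * h            # main head: main loss only
grad_a = aux_weight * 2 * aux_err * h              # aux head: aux loss only
grad_s = (main_weight * 2 * main_err * m
          + aux_weight * 2 * aux_err * a) * x      # shared weight: BOTH losses

print(grad_s)  # includes a contribution from the aux loss
# Setting aux_weight = 0 would remove that contribution from grad_s.
```

This is exactly why the auxiliary branch gets trained even though it is not connected to the main output: its loss term reaches every weight on the path from the input to aux_output, including the shared layers.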
Is the part of the network between the root of the auxiliary branch and the main output affected by the weighting of the loss? Or does the weighting influence only the part of the network that is connected to the auxiliary output?

The weights are plain mathematics. You define them in compile:

model.compile(optimizer=one_optimizer,

              #you choose each loss
              loss={'main_output': main_loss, 'aux_output': aux_loss},

              #you choose each weight
              loss_weights={'main_output': main_weight, 'aux_output': aux_weight},

              metrics = ...)

And the loss function will use them in loss = (weight1 * loss1) + (weight2 * loss2).
The rest is the mathematical calculation of ∂(loss)/∂(weight_i) for each weight.
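To make that concrete, here is a self-contained sketch (plain Python gradient descent, not Keras; the two quadratic losses and the 0.7/0.3 weights are invented). Descending the single combined loss settles on a compromise between the two objectives, and the loss weights decide where that compromise lies:

```python
# Gradient descent on loss = 0.7*(w-2)**2 + 0.3*(w+1)**2:
# one gradient, one update per step, driven by both loss terms at once.
main_weight, aux_weight = 0.7, 0.3
lr, w = 0.1, 0.0
for _ in range(200):
    # hand-derived d(loss)/dw
    grad = main_weight * 2 * (w - 2.0) + aux_weight * 2 * (w + 1.0)
    w -= lr * grad
print(round(w, 3))  # → 1.1, between the two minima (2 and -1), set by the weights
```

Shifting main_weight toward 1 would pull the solution toward 2 (the main objective's minimum); shifting aux_weight up would pull it toward -1. That is all the loss weighting does.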

