Multi-Output Multi-Class Keras Model


Problem description

For each input, I have an associated 49x2 matrix. Here's what one input-output pair looks like:

input :
[Car1, Car2, Car3 ..., Car118]

output :
[[Label1 Label2]
 [Label1 Label2]
      ...
 [Label1 Label2]]

where both Label1 and Label2 are label-encoded and have 1200 and 1300 different classes, respectively.

Just to make sure: is this what we call a multi-output multi-class problem?

I tried flattening the output, but I feared the model wouldn't understand that all similar labels share the same classes.

Is there a Keras layer that can handle output with this peculiar array shape?

Recommended answer

Generally, multi-class problems correspond to models outputting a probability distribution over the set of classes (typically scored against the one-hot encoding of the actual class through cross-entropy). Now, independently of whether you structure it as one single output, two outputs, 49 outputs or 49 x 2 = 98 outputs, that would mean having 1,200 x 49 + 1,300 x 49 = 122,500 output units, which is not something a computer cannot handle, but maybe not the most convenient thing to have. You could try having each class output be a single (e.g. linear) unit and round its value to choose the label, but unless the labels have some numerical meaning (e.g. order, sizes, etc.), that is not likely to work.
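For reference, here is a minimal sketch (not part of the original answer) of what that "brute force" option could look like in Keras: a plain feed-forward model with two heads of 49 x 1,200 and 49 x 1,300 units, one softmax per row. The hidden size and the loss setup are assumptions.

# Hypothetical sketch of the naive approach: 122,500 output units in two heads.
from keras.models import Model
from keras.layers import Input, Dense, Reshape, Activation

n_cars, n_rows = 118, 49
n_classes_1, n_classes_2 = 1200, 1300

inp = Input(shape=(n_cars,))
h = Dense(512, activation='relu')(inp)              # hidden size is an arbitrary choice

# Head for Label1: 49 rows, 1200-way softmax over each row
out1 = Dense(n_rows * n_classes_1)(h)
out1 = Reshape((n_rows, n_classes_1))(out1)
out1 = Activation('softmax', name='label1')(out1)   # softmax over the last axis

# Head for Label2: 49 rows, 1300-way softmax over each row
out2 = Dense(n_rows * n_classes_2)(h)
out2 = Reshape((n_rows, n_classes_2))(out2)
out2 = Activation('softmax', name='label2')(out2)

model = Model(inp, [out1, out2])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy')
# Targets: integer class ids of shape (batch, 49) for each head
# (older Keras versions may expect (batch, 49, 1)).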

If the order of the elements in the input has some meaning (that is, shuffling it would affect the output), I would approach the problem with an RNN, such as an LSTM or a bidirectional LSTM model, with two outputs. Use return_sequences=True and TimeDistributed Dense softmax layers for the outputs; for each 118-long input you would then have 118 pairs of outputs. You can use temporal sample weighting to drop, for example, the first 69 (or perhaps the first 35 and the last 34 if you are using a bidirectional model) and compute the loss with the remaining 49 pairs of labellings. Or, if that makes sense for your data (maybe it doesn't), you could go with something more advanced like CTC, which is also implemented in Keras (thanks @indraforyou)!
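A hedged sketch of this RNN idea, assuming each "Car" value is fed as a one-dimensional feature per timestep; the layer sizes and the weighting scheme (keeping the last 49 of the 118 steps) are illustrative, not prescribed by the answer.

import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, Bidirectional, TimeDistributed, Dense

n_steps = 118
n_rows, n_classes_1, n_classes_2 = 49, 1200, 1300

# Input data reshaped to (batch, 118, 1): one scalar feature per timestep.
inp = Input(shape=(n_steps, 1))
h = Bidirectional(LSTM(128, return_sequences=True))(inp)
out1 = TimeDistributed(Dense(n_classes_1, activation='softmax'), name='label1')(h)
out2 = TimeDistributed(Dense(n_classes_2, activation='softmax'), name='label2')(h)

model = Model(inp, [out1, out2])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              sample_weight_mode='temporal')         # enables per-timestep weights

# Per-timestep weights that zero out all but 49 of the 118 output steps
# (here the last 49; a bidirectional model might keep the middle ones instead).
batch_size = 32
w = np.zeros((batch_size, n_steps))
w[:, -n_rows:] = 1.0
# model.fit(x, [y1, y2], sample_weight=[w, w], ...)
# y1, y2: integer class ids padded to shape (batch, 118); the zero-weighted
# timesteps do not contribute to the loss.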

If the order in the input has no meaning but the order of the outputs does, then you could have an RNN where your input is the original 118-long vector plus a pair of labels (each one-hot encoded), and the output is again a pair of labels (again two softmax layers). The idea is that on each frame you get one "row" of the 49x2 output, and you then feed it back to the network along with the initial input to get the next one; at training time, you would repeat the input 49 times along with the "previous" pair of labels (an empty label for the first step).
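A possible sketch of this feedback scheme with teacher forcing at training time; the input names, hidden size and sparse loss are assumptions, not something specified in the answer.

from keras.models import Model
from keras.layers import Input, LSTM, Dense, TimeDistributed, concatenate

n_cars, n_rows = 118, 49
n_classes_1, n_classes_2 = 1200, 1300

# The original 118-long vector, repeated 49 times: shape (49, 118)
cars_in = Input(shape=(n_rows, n_cars), name='cars_repeated')
# The "previous" pair of labels at each step, one-hot encoded (all zeros for step 0)
prev1_in = Input(shape=(n_rows, n_classes_1), name='prev_label1')
prev2_in = Input(shape=(n_rows, n_classes_2), name='prev_label2')

x = concatenate([cars_in, prev1_in, prev2_in])       # (49, 118 + 1200 + 1300)
h = LSTM(256, return_sequences=True)(x)
out1 = TimeDistributed(Dense(n_classes_1, activation='softmax'), name='label1')(h)
out2 = TimeDistributed(Dense(n_classes_2, activation='softmax'), name='label2')(h)

model = Model([cars_in, prev1_in, prev2_in], [out1, out2])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
# Targets: integer class ids of shape (batch, 49) per head. At inference time
# you would instead run the network one step at a time, feeding each predicted
# pair back in as the "previous" labels of the next step.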

If there are no sequential relationships to exploit (i.e. the order of the input and the output has no special meaning), then the problem can only be truly represented by the initial 122,500 output units (plus all the hidden units you may need to get those right). You could also try some kind of middle ground between a regular network and an RNN, where you have the two softmax outputs and, along with the 118-long vector, you include the "id" of the output you want (e.g. as a 49-long one-hot encoded vector); if the "meaning" of each label is similar or comparable across the 49 outputs, this may work.
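A small sketch of that middle-ground option, assuming each training example is "unrolled" into 49 examples that share the same 118-long input but differ in the one-hot row id; all names and sizes here are illustrative.

from keras.models import Model
from keras.layers import Input, Dense, concatenate

n_cars, n_rows = 118, 49
n_classes_1, n_classes_2 = 1200, 1300

cars_in = Input(shape=(n_cars,), name='cars')
row_id_in = Input(shape=(n_rows,), name='row_id')    # one-hot index of the desired output row

h = concatenate([cars_in, row_id_in])
h = Dense(512, activation='relu')(h)

out1 = Dense(n_classes_1, activation='softmax', name='label1')(h)
out2 = Dense(n_classes_2, activation='softmax', name='label2')(h)

model = Model([cars_in, row_id_in], [out1, out2])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
# Targets: one integer class id per head for the requested row; predicting the
# full 49x2 matrix then takes 49 forward passes with different row ids.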
