What is the replacement for the softmax layer in case more than one output can be activated?


Problem description

For example, I have a CNN that tries to predict digits from the MNIST dataset (code written using Keras). It has 10 outputs, which form a softmax layer. Only one of the outputs can be true (independently for each digit from 0 to 9):

Real: [0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
Predicted: [0.02, 0.9, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01]

The sum of the predictions is equal to 1.0 due to the definition of softmax.
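To make the constraint concrete, here is a small numeric sketch (plain NumPy; the logit values are invented for illustration):

import numpy as np

# Hypothetical raw outputs (logits) of the network for one image
logits = np.array([1.2, 5.0, 0.3, 0.1, -0.5, 0.0, 0.4, -1.0, 0.2, 0.3])

# Softmax exponentiates and normalizes, so the outputs always sum to 1
softmax = np.exp(logits) / np.sum(np.exp(logits))
print(softmax.sum())  # 1.0 -- only one class can dominate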

Let's say I have a task where I need to classify some objects that can fall into several categories:

Real: [0, 1, 0, 1, 0, 1, 0, 0, 0, 1]

So I need to normalize in some other way. I need a function that gives values in the range [0, 1] and whose sum can be larger than 1.

I need something like this:

Predicted: [0.1, 0.9, 0.05, 0.9, 0.01, 0.8, 0.1, 0.01, 0.2, 0.9]

Each number is the probability that the object falls into the given category. After that I can use some threshold, like 0.5, to decide which categories a given object belongs to.
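The sigmoid function has exactly this property: it maps each output to [0, 1] independently of the others, so the values do not have to sum to 1. A minimal sketch (plain NumPy; the logits are chosen so the result matches the prediction example above):

import numpy as np

# Hypothetical raw outputs for an object belonging to several categories
logits = np.array([-2.2, 2.2, -2.9, 2.2, -4.6, 1.4, -2.2, -4.6, -1.4, 2.2])

# Sigmoid squashes each value to [0, 1] independently
probs = 1.0 / (1.0 + np.exp(-logits))
print(np.round(probs, 2))  # [0.1 0.9 0.05 0.9 0.01 0.8 0.1 0.01 0.2 0.9]
print(probs.sum())         # the sum can be larger than 1

# Threshold at 0.5 to decide which categories the object belongs to
print((probs >= 0.5).astype(int))  # [0 1 0 1 0 1 0 0 0 1]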

The following questions arise:

  1. So which activation function can be used here?
  2. Does such a function perhaps already exist in Keras?
  3. Can you perhaps suggest other ways to make predictions in this case?

Recommended answer

Your problem is one of multi-label classification, and in the context of Keras it is discussed, for example, here: https://github.com/fchollet/keras/issues/741

In short, the suggested solution in Keras is to replace the softmax layer with a sigmoid layer and use binary_crossentropy as your cost function.

An example from that thread, adapted to the current Keras/Python API:

# Build a classifier optimized for maximizing f1_score (uses class_weights)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam
from sklearn.metrics import f1_score

clf = Sequential()

# xt/yt are the training features and multi-hot labels, xs/ys the validation
# split and W the per-class weights (all defined elsewhere in the thread)
clf.add(Dropout(0.3, input_shape=(xt.shape[1],)))
clf.add(Dense(1600, activation='relu'))
clf.add(Dropout(0.6))
clf.add(Dense(1200, activation='relu'))
clf.add(Dropout(0.6))
clf.add(Dense(800, activation='relu'))
clf.add(Dropout(0.6))
# One sigmoid unit per label instead of a softmax over all labels
clf.add(Dense(yt.shape[1], activation='sigmoid'))

clf.compile(optimizer=Adam(), loss='binary_crossentropy')

clf.fit(xt, yt, batch_size=64, epochs=300,
        validation_data=(xs, ys), class_weight=W, verbose=0)

# Threshold the per-label probabilities at 0.5 to get binary predictions
preds = clf.predict(xs)
preds[preds >= 0.5] = 1
preds[preds < 0.5] = 0

print(f1_score(ys, preds, average='macro'))
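Applied to the MNIST-style CNN from the question, the change boils down to swapping the final activation and the loss function. A minimal sketch with tf.keras (the convolutional layer sizes are illustrative, not taken from the question):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    # sigmoid instead of softmax: each of the 10 outputs becomes an
    # independent probability in [0, 1]
    Dense(10, activation='sigmoid'),
])

# binary_crossentropy treats every output as its own yes/no decision
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])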
