Cost function always returning zero for a binary classification in TensorFlow

Question

I have written the following binary classification program in TensorFlow, and it is buggy: the cost is always zero, no matter what the input is. I am trying to debug a larger program that is not learning anything from the data, and I have narrowed down at least one bug to the cost function always returning zero. The program below uses random inputs and shows the same problem. In the original program, self.X_train and self.y_train are read from files, and self.predict() has more layers, forming a feedforward neural network.

import numpy as np
import tensorflow as tf

class annClassifier():

    def __init__(self):

        with tf.variable_scope("Input"):
            self.X = tf.placeholder(tf.float32, shape=(100, 11))

        with tf.variable_scope("Output"):
            self.y = tf.placeholder(tf.float32, shape=(100, 1))

        self.X_train = np.random.rand(100, 11)
        self.y_train = np.random.randint(0,2, size=(100, 1))

    def predict(self):

        with tf.variable_scope('OutputLayer'):
            weights = tf.get_variable(name='weights',
                                      shape=[11, 1],
                                      initializer=tf.contrib.layers.xavier_initializer())
            bases = tf.get_variable(name='bases',
                                    shape=[1],
                                    initializer=tf.zeros_initializer())
            final_output = tf.matmul(self.X, weights) + bases

        return final_output

    def train(self):

        prediction = self.predict()
        cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=self.y))

        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())         
            print(sess.run(cost, feed_dict={self.X:self.X_train, self.y:self.y_train}))


with tf.Graph().as_default():
    classifier = annClassifier()
    classifier.train()

If someone could please figure out what I am doing wrong in this, I can try making the same change in my original program. Thanks a lot!

Recommended answer

The only problem is that the wrong cost function is used. softmax_cross_entropy_with_logits is meant for outputs with one logit per class (in practice, more than two classes). The softmax of a single output always returns 1, because it is defined as:

softmax(x)_i = exp(x_i) / SUM_j exp(x_j)

So for a single number (a one-dimensional output):

softmax(x) = exp(x) / exp(x) = 1
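
The identity above is easy to check numerically. The following is a minimal sketch in plain Python rather than TensorFlow; the helper name `softmax` is only illustrative and is not the TF op itself:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats (illustrative helper)."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# With a single logit per example, the result is 1.0 whatever the logit is.
single = [softmax([v])[0] for v in (-5.0, 0.0, 3.7)]

# With two logits, the probabilities actually depend on the inputs.
pair = softmax([0.0, 1.0])

print(single, pair)
```

With one output column, the "probability" carries no information about the input, so nothing downstream of it can learn.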

Furthermore, for softmax output TF expects one-hot encoded labels, so if you provide only 0 or 1, there are two possibilities:

  1. True label is 0, so the cost is -0*log(1) = 0
  2. True label is 1, so the cost is -1*log(1) = 0
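
Both cases can be reproduced with a short sketch. This is plain Python that mirrors the softmax cross-entropy formula, not the TensorFlow implementation, and the function name `softmax_cross_entropy` is an assumption made for illustration:

```python
import math

def softmax_cross_entropy(logits, labels):
    """Cross-entropy between labels and softmax(logits) for one example."""
    shift = max(logits)
    log_total = shift + math.log(sum(math.exp(v - shift) for v in logits))
    # log softmax(x)_i = x_i - log(sum_j exp(x_j))
    return -sum(y * (v - log_total) for v, y in zip(logits, labels))

# One output column, as in the question: the loss is zero for both labels.
losses_single = [softmax_cross_entropy([2.3], [y]) for y in (0.0, 1.0)]

# Two output columns with a one-hot label: the loss depends on the logits.
loss_two = softmax_cross_entropy([2.3, -1.0], [0.0, 1.0])

print(losses_single, loss_two)
```

The single-column case stays at zero for any logit value, which matches the behaviour reported in the question.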

TensorFlow has a separate function for binary classification, which applies a sigmoid instead (note that with more than one output, the same function applies the sigmoid independently to each dimension, which is what multi-label classification expects):

tf.nn.sigmoid_cross_entropy_with_logits

Just switch to this cost and you are good to go. You no longer have to encode anything as one-hot either, since this function is designed for exactly this use case.
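
To see why the sigmoid variant does not collapse to zero, here is a minimal plain-Python sketch of the numerically stable form that the TensorFlow documentation gives for this loss, max(x, 0) - x * z + log(1 + exp(-|x|)), where x is the logit and z the 0/1 label. The function name and the sample values are assumptions chosen for illustration:

```python
import math

def sigmoid_cross_entropy(logit, label):
    """Stable form: max(x, 0) - x * z + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

labels = [0.0, 1.0, 1.0, 0.0]   # plain 0/1 targets, no one-hot encoding
logits = [0.0, 0.0, 2.0, 2.0]   # one logit per example

losses = [sigmoid_cross_entropy(x, z) for x, z in zip(logits, labels)]
mean_loss = sum(losses) / len(losses)

print(losses, mean_loss)
```

A logit of 0 gives log(2) for either label, and a confident logit is penalised heavily only when it disagrees with the label, so the cost now carries a usable signal.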

The only missing bit is that your code has no actual training routine: you need to define an optimiser, ask it to minimise the loss, and then run the train op in a loop. In your current setting you just predict over and over with a network that never changes.
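
The shape of such a training loop can be sketched without TensorFlow. The example below is plain Python with hand-written gradient descent on a single sigmoid unit; the synthetic data, learning rate, and step count are all assumptions chosen for illustration. In TF 1.x the same role would be played by an optimiser such as tf.train.GradientDescentOptimizer, whose minimize() op is run repeatedly inside the session.

```python
import math
import random

def sigmoid(v):
    # Split by sign so math.exp never overflows.
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

def mean_loss(w, b, xs, ys):
    """Mean sigmoid cross-entropy of a single linear unit."""
    total = 0.0
    for x, y in zip(xs, ys):
        logit = sum(wi * xi for wi, xi in zip(w, x)) + b
        total += max(logit, 0.0) - logit * y + math.log1p(math.exp(-abs(logit)))
    return total / len(xs)

random.seed(0)
xs = [[random.random() for _ in range(3)] for _ in range(100)]
ys = [1.0 if x[0] > 0.5 else 0.0 for x in xs]  # synthetic, learnable labels

w, b, lr = [0.0, 0.0, 0.0], 0.0, 0.5
history = [mean_loss(w, b, xs, ys)]
for step in range(200):
    grad_w, grad_b = [0.0] * len(w), 0.0
    for x, y in zip(xs, ys):
        # Gradient of the loss w.r.t. the logit is sigmoid(logit) - y.
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
        grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
        grad_b += err
    w = [wi - lr * g / len(xs) for wi, g in zip(w, grad_w)]
    b -= lr * grad_b / len(xs)
    history.append(mean_loss(w, b, xs, ys))

print(history[0], history[-1])
```

The loss starts at log(2) with zero weights and falls as the parameters are updated. Evaluating the cost once, as the question's train() does, never moves the weights.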

In particular, please refer to the Cross Entropy Jungle question on SO, which describes in more detail all these different helper functions in TF (and other libraries) and their different requirements and use cases.
