Cost function always returning zero for a binary classification in TensorFlow


Question

I have written the following binary classification program in TensorFlow, and it is buggy: the cost is always zero, no matter what the input is. I am trying to debug a larger program that is not learning anything from the data, and I have narrowed down at least one bug to the cost function always returning zero. The program below uses random inputs and shows the same problem. In the original program, self.X_train and self.y_train are read from files, and self.predict() has more layers, forming a feedforward neural network.

import numpy as np
import tensorflow as tf

class annClassifier():

    def __init__(self):

        with tf.variable_scope("Input"):
            self.X = tf.placeholder(tf.float32, shape=(100, 11))

        with tf.variable_scope("Output"):
            self.y = tf.placeholder(tf.float32, shape=(100, 1))

        self.X_train = np.random.rand(100, 11)
        self.y_train = np.random.randint(0,2, size=(100, 1))

    def predict(self):

        with tf.variable_scope('OutputLayer'):
            weights = tf.get_variable(name='weights',
                                      shape=[11, 1],
                                      initializer=tf.contrib.layers.xavier_initializer())
            bases = tf.get_variable(name='bases',
                                    shape=[1],
                                    initializer=tf.zeros_initializer())
            final_output = tf.matmul(self.X, weights) + bases

        return final_output

    def train(self):

        prediction = self.predict()
        cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=self.y))

        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())         
            print(sess.run(cost, feed_dict={self.X:self.X_train, self.y:self.y_train}))


with tf.Graph().as_default():
    classifier = annClassifier()
    classifier.train()

If someone could point out what I am doing wrong here, I can try making the same change in my original program. Thanks a lot!

Recommended answer

The only problem is that the cost being used is invalid for this setup. softmax_cross_entropy_with_logits should be used when you have more than two classes (one output per class), because the softmax of a single output always returns 1. Softmax is defined as:

softmax(x)_i = exp(x_i) / SUM_j exp(x_j)

So for a single number (a one-dimensional output):

softmax(x) = exp(x) / exp(x) = 1

Furthermore, for softmax output TF expects one-hot encoded labels, so if you provide only 0 or 1, there are two possibilities:

  1. The true label is 0, so the cost is -0*log(1) = 0
  2. The true label is 1, so the cost is -1*log(1) = 0
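
The two cases above can be checked numerically. The following is a minimal NumPy sketch, not TensorFlow's own implementation: the helper names softmax and softmax_cross_entropy are hypothetical and only mirror the definition given earlier, with one logit per example as in the question.

```python
import numpy as np

def softmax(logits):
    # Softmax over the last (class) axis, shifted for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def softmax_cross_entropy(logits, labels):
    # Per-example cross-entropy: -sum_i labels_i * log(softmax(logits)_i)
    return -(labels * np.log(softmax(logits))).sum(axis=-1)

rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 1))  # a single output unit, as in the question
labels = rng.integers(0, 2, size=(5, 1)).astype(np.float64)

probs = softmax(logits)  # exp(x) / exp(x) for every row
costs = softmax_cross_entropy(logits, labels)
mean_cost = costs.mean()
print(probs.ravel(), mean_cost)
```

Whatever the logits and labels are, every probability is 1 and the mean cost is zero, which matches the behaviour reported in the question.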

TensorFlow has a separate function for binary classification, which applies a sigmoid instead. (Note that with more than one output, the same function applies the sigmoid independently to each dimension, which is what multi-label classification expects.)

tf.nn.sigmoid_cross_entropy_with_logits

Just switch to this cost and you are good to go. You do not have to encode anything as one-hot either, because this function is designed for exactly your use case.
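
To see why the sigmoid-based cost does not collapse to zero, here is a small NumPy sketch of the per-example loss. It assumes the numerically stable form max(x, 0) - x*z + log(1 + exp(-|x|)) for logit x and label z, which is algebraically equal to the usual binary cross-entropy; the function name sigmoid_cross_entropy and the sample values are made up for illustration.

```python
import numpy as np

def sigmoid_cross_entropy(logits, labels):
    # Stable form of
    #   -labels * log(sigmoid(x)) - (1 - labels) * log(1 - sigmoid(x))
    return (np.maximum(logits, 0)
            - logits * labels
            + np.log1p(np.exp(-np.abs(logits))))

# One logit per example and plain 0/1 labels (no one-hot encoding needed).
logits = np.array([[-2.0], [0.0], [3.0]])
labels = np.array([[0.0], [1.0], [0.0]])

losses = sigmoid_cross_entropy(logits, labels)
mean_loss = losses.mean()

# Naive formula for comparison; fine here because the logits are small.
p = 1.0 / (1.0 + np.exp(-logits))
naive = -labels * np.log(p) - (1 - labels) * np.log(1 - p)
print(losses.ravel(), mean_loss)
```

A logit of 0 with label 1 costs log(2), and the confidently wrong third example costs the most, so the loss now depends on both the prediction and the label.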

The only missing piece is that your code has no actual training routine: you need to define an optimiser, ask it to minimise the loss, and then run the train op in a loop. In your current setting you just try to predict over and over with a network that never changes.
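
As a rough illustration of what such a training routine does, the sketch below runs plain gradient descent on a single-layer model in NumPy. It is not the asker's program: the learnable target (label 1 when the first feature exceeds 0.5), the learning rate, and the step count are all assumptions. In TF 1.x the equivalent would be an optimiser such as tf.train.GradientDescentOptimizer(...).minimize(cost), with the resulting op run through sess.run inside a loop.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 11))
# Hypothetical learnable target: 1 when the first feature exceeds 0.5.
y = (X[:, :1] > 0.5).astype(np.float64)

weights = np.zeros((11, 1))
bias = np.zeros(1)
learning_rate = 0.5  # assumed value, small enough for stable descent here

def mean_loss(logits, labels):
    # Mean sigmoid cross-entropy in its numerically stable form.
    return np.mean(np.maximum(logits, 0) - logits * labels
                   + np.log1p(np.exp(-np.abs(logits))))

history = []
for step in range(200):
    logits = X @ weights + bias
    history.append(mean_loss(logits, y))
    # d(mean loss)/d(logits) = (sigmoid(logits) - labels) / N
    grad_logits = (1.0 / (1.0 + np.exp(-logits)) - y) / len(X)
    # Update the parameters; without this step the model never changes.
    weights -= learning_rate * (X.T @ grad_logits)
    bias -= learning_rate * grad_logits.sum(axis=0)

print(history[0], history[-1])
```

The loss starts at log(2) because the weights are initialised to zero, and it falls as the updates are applied. Evaluating the cost once, as in the question, only ever reports that starting value.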

For more detail, see the Cross Entropy Jungle question on SO, which describes all of these helper functions in TF (and other libraries) and their different requirements and use cases.
