Why does the logistic regression cost go negative?


Question


I am implementing logistic regression in Matlab. The data is normalized (mean and std). I understand that depending on your learning rate you may overshoot the optimal point. But doesn't that mean your cost starts going up? In my case the cost goes into negative territory, I don't understand why.


Here is the standard (I think?) cost and weight update rule

function J = crossEntropyError(w, x, y)
  % Cross-entropy cost summed over all training examples.
  h = sigmoid(x*w);
  J = -y'*log(h) - (1-y')*log(1-h);
end

Weight update:

function w = updateWeights(alpha, w, x, y)
  % One batch gradient-descent step on the cross-entropy cost.
  h = sigmoid(x*w);
  gradient = x'*(h-y);    % gradient of the summed cross-entropy w.r.t. w
  w = w - alpha*gradient;
end
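For reference, the two MATLAB functions above can be sketched in plain Python (a minimal illustration, not the original code; the `sigmoid` helper and all names here are assumptions):

```python
import math

def sigmoid(z):
    # Logistic function mapping any real z into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy_error(w, xs, ys):
    # Sum of per-example losses -y*log(h) - (1-y)*log(1-h).
    # Each term is non-negative as long as every label y is 0 or 1.
    total = 0.0
    for x, y in zip(xs, ys):
        h = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        total += -y * math.log(h) - (1 - y) * math.log(1 - h)
    return total

def update_weights(alpha, w, xs, ys):
    # One batch gradient step: the gradient of the summed
    # cross-entropy with respect to w is X' * (h - y).
    grad = [0.0] * len(w)
    for x, y in zip(xs, ys):
        h = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        for j, xj in enumerate(x):
            grad[j] += (h - y) * xj
    return [wj - alpha * gj for wj, gj in zip(w, grad)]
```

With labels restricted to 0/1, each step with a small enough `alpha` should lower the cost, which stays bounded below by zero.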


This is what happens with my cost, x-axis is the iteration:


This makes no sense. When hitting 0, isn't it supposed to self-correct and go in the other direction, since the derivative points toward the minimum? I've played with the learning rate; here it's set to a tiny 0.0001, but it makes no difference, same pattern. What's the issue? There must be something really wrong here, though I can't find it.

Answer


So I realized my mistake, and it's quite silly. I was using a dataset where the labels are not boolean (0 or 1), which produced the cross-entropy behavior above. The code was correct, but it is not suitable for non-boolean labels.
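A quick way to see this failure mode (a hypothetical check, not from the original post): with a label such as y = 2, the term -(1 - y)·log(1 - h) flips sign to +log(1 - h), which is negative, so the per-example "cost" can drop below zero:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cross_entropy_term(h, y):
    # The per-example cost from the question:
    # -y*log(h) - (1-y)*log(1-h)
    return -y * math.log(h) - (1 - y) * math.log(1 - h)

h = sigmoid(2.0)                 # a fairly confident prediction, h ~ 0.88
print(cross_entropy_term(h, 1))  # label 1: positive, as expected
print(cross_entropy_term(h, 2))  # label 2: goes negative
```

Gradient descent then keeps "improving" this unbounded-below objective, so the plotted cost marches into negative territory instead of converging, matching what the question describes.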


I would delete the question but I wouldn't want my account to be blocked. Maybe it can help someone?
