Gradient Descent implementation in Octave

Views: 86 Published: 2020/5/19 19:39:17 Tags: octave


Question


I've actually been struggling against this for like 2 months now. What is it that makes these different?

hypotheses= X * theta
temp=(hypotheses-y)'
temp=X(:,1) * temp
temp=temp * (1 / m)
temp=temp * alpha
theta(1)=theta(1)-temp

hypotheses= X * theta
temp=(hypotheses-y)'
temp=temp * (1 / m)
temp=temp * alpha
theta(2)=theta(2)-temp



theta(1) = theta(1) - alpha * (1/m) * ((X * theta) - y)' * X(:, 1);
theta(2) = theta(2) - alpha * (1/m) * ((X * theta) - y)' * X(:, 2);


The latter works. I'm just not sure why. I struggle to understand the need for the matrix inverse.

Recommended answer


In the second block of your first example you've missed out a step, haven't you? I am assuming you concatenated X with a vector of ones.

   temp=X(:,2) * temp
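
The shapes involved may also help explain why the one-line version behaves differently from the step-by-step one. Here is a minimal sketch, assuming a small made-up data set (the values of `X`, `y` and `theta` are illustrative and not from the question). Putting the column of `X` on the left of the transposed residual gives an m-by-m outer product, whereas putting the transposed residual on the left gives the single number a theta update needs. This holds whichever column of `X` is used.

```octave
% Toy data (assumed for illustration): m = 3 examples, intercept column plus one feature
X = [1 1; 1 2; 1 3];      % first column is the vector of ones
y = [2; 4; 6];
theta = zeros(2, 1);

r = (X * theta) - y;      % residuals h(x) - y, size m x 1

outer = X(:, 1) * r';     % (m x 1) * (1 x m) gives an m x m matrix
inner = r' * X(:, 1);     % (1 x m) * (m x 1) gives a 1 x 1 scalar

disp(size(outer));        % dimensions of the outer product
disp(size(inner));        % dimensions of the inner product
```

Only `inner` can be subtracted from `theta(1)`, which is a single element.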


The last example will work, but it can be vectorized even further to be simpler and more efficient.


I've assumed you only have 1 feature. It will work the same with multiple features, since all that happens is you add an extra column to your X matrix for each feature. Basically, you add a vector of ones to x to vectorize the intercept.
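
As a rough illustration of the multiple-feature case, the sketch below builds X from a hypothetical two-feature matrix (the name `features` and all numbers are made up for the example) and applies one update step. The update line does not change; only the sizes of X and theta do.

```octave
% Hypothetical data: m = 4 examples, 2 features (values made up for illustration)
features = [1 2; 2 1; 3 4; 4 3];
y = [5; 4; 11; 10];
m = length(y);

X = [ones(m, 1), features];     % m x 3: intercept column plus one column per feature
theta = zeros(size(X, 2), 1);   % 3 x 1, one parameter per column of X
alpha = 0.01;                   % learning rate chosen arbitrarily for this sketch

% One gradient descent step, written the same way as for a single feature
theta = theta - alpha * (1 / m) * (((X * theta) - y)' * X)';

disp(theta);                    % theta is still a 3 x 1 column vector
```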


You can update a 2x1 matrix of thetas in one line of code. Concatenate a vector of ones with x to make it an nx2 matrix; you can then calculate h(x) by multiplying it by the theta vector (2x1). This is the (X * theta) bit.


The second part of the vectorization is to transpose ((X * theta) - y), which gives you a 1*n matrix. Multiplying that by X (an n*2 matrix) basically aggregates both (h(x)-y)x0 and (h(x)-y)x1, so by definition both thetas are handled at the same time. The result is a 1*2 matrix of update terms for my thetas, which I transpose again to flip the vector around to the same dimensions as the theta vector. I can then do a simple scalar multiplication by alpha and a vector subtraction from theta.
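
To make the aggregation concrete, the sketch below compares a per-column loop with the vectorized product on a small assumed data set (all values are illustrative). It also shows that `X' * r` gives the same column vector without the second transpose; that form is an equivalent alternative, not something the answer itself uses.

```octave
% Toy data (assumed for illustration)
X = [1 1; 1 2; 1 3];            % column of ones plus one feature
y = [2; 4; 6];
theta = [0.5; 0.5];

r = (X * theta) - y;            % residuals h(x) - y, size n x 1

% Explicit aggregation: sum of (h(x) - y) * x_j for each column j
g_loop = zeros(2, 1);
for j = 1:2
  g_loop(j) = sum(r .* X(:, j));
end

g_vec = (r' * X)';              % (1 x n) * (n x 2), transposed back to 2 x 1
g_alt = X' * r;                 % same result without the extra transpose

disp([g_loop, g_vec, g_alt]);   % the three columns should match
```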

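% data is assumed to be an m-by-2 matrix: column 1 is the feature x, column 2 is the target y
X = data(:, 1); y = data(:, 2);
m = length(y);               % number of training examples
X = [ones(m, 1), data(:,1)]; % prepend a column of ones for the intercept
theta = zeros(2, 1);         % initial parameters [theta0; theta1]

iterations = 2000;
alpha = 0.001;               % learning rate

for iter = 1:iterations
     % update both thetas simultaneously
     theta = theta -((1/m) * ((X * theta) - y)' * X)' * alpha;
end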
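
One way to sanity-check the loop is to run it on synthetic data with a known answer. The sketch below assumes a made-up `data` matrix generated from y = 1 + 2x and compares the result with Octave's least-squares solution `X \ y`. The learning rate and iteration count here are chosen for this toy data so that the loop converges; they are assumptions for the example and differ from the values in the answer.

```octave
% Synthetic data (assumed): y = 1 + 2x exactly, so the expected theta is [1; 2]
x = (1:5)';
data = [x, 1 + 2 * x];

y = data(:, 2);
m = length(y);
X = [ones(m, 1), data(:, 1)];   % column of ones for the intercept
theta = zeros(2, 1);

iterations = 5000;              % assumed value for this toy data
alpha = 0.05;                   % assumed value for this toy data

for iter = 1:iterations
  theta = theta - alpha * (1 / m) * (((X * theta) - y)' * X)';
end

theta_exact = X \ y;            % least-squares solution for comparison
disp([theta, theta_exact]);     % the two columns should agree closely
```

On real data the step size usually needs tuning, and scaling the features first can make convergence much faster.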
