Implementing a linear regression using gradient descent
Question
I'm trying to implement linear regression with gradient descent as explained in this article (https://towardsdatascience.com/linear-regression-using-gradient-descent-97a6c8700931). I've followed the implementation to the letter, yet my results overflow after a few iterations. The result I'm expecting is approximately: y = -0.02x + 8499.6.
Code:
package main

import (
	"encoding/csv"
	"fmt"
	"strconv"
	"strings"
)

const (
	iterations   = 1000
	learningRate = 0.0001
)

func computePrice(m, x, c float64) float64 {
	return m*x + c
}

func computeThetas(data [][]float64, m, c float64) (float64, float64) {
	N := float64(len(data))
	dm, dc := 0.0, 0.0
	for _, dataField := range data {
		x := dataField[0]
		y := dataField[1]
		yPred := computePrice(m, x, c)
		dm += (y - yPred) * x
		dc += y - yPred
	}
	dm *= -2 / N
	dc *= -2 / N
	return m - learningRate*dm, c - learningRate*dc
}

func main() {
	data := readXY()
	m, c := 0.0, 0.0
	for k := 0; k < iterations; k++ {
		m, c = computeThetas(data, m, c)
	}
	fmt.Printf("%.4fx + %.4f\n", m, c)
}

func readXY() [][]float64 {
	file := strings.NewReader(data)
	reader := csv.NewReader(file)
	records, err := reader.ReadAll()
	if err != nil {
		panic(err)
	}
	records = records[1:]
	size := len(records)
	data := make([][]float64, size)
	for i, v := range records {
		val1, err := strconv.ParseFloat(v[0], 64)
		if err != nil {
			panic(err)
		}
		val2, err := strconv.ParseFloat(v[1], 64)
		if err != nil {
			panic(err)
		}
		data[i] = []float64{val1, val2}
	}
	return data
}
var data = `km,price
240000,3650
139800,3800
150500,4400
185530,4450
176000,5250
114800,5350
166800,5800
89000,5990
144500,5999
84000,6200
82029,6390
63060,6390
74000,6600
97500,6800
67000,6800
76025,6900
48235,6900
93000,6990
60949,7490
65674,7555
54000,7990
68500,7990
22899,7990
61789,8290`
And here it can be worked on in the Go Playground: https://play.golang.org/p/2CdNbk9_WeY
What do I need to fix to get the correct result?
Answer
Why would a formula work on one data set and not another?
In addition to sascha's remarks, here's another way to look at the problems of this application of gradient descent: the algorithm offers no guarantee that an iteration yields a better result than the previous one, so it doesn't necessarily converge to a result, because:
- The gradients dm and dc in axes m and c are handled independently from each other; m is updated in the descending direction according to dm, and c at the same time is updated in the descending direction according to dc. But with certain curved surfaces z = f(m, c), the gradient in a direction between the axes m and c can have the opposite sign compared to m and c on their own, so while updating either one of m or c alone would converge, updating both moves away from the optimum.
- However, the more likely failure reason in this case of linear regression on a point cloud is the entirely arbitrary magnitude of the update to m and c, determined by the product of an obscure learning rate and the gradient. It is quite possible that such an update oversteps a minimum of the target function, and even that the overshoot is repeated with a higher amplitude in each iteration.
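A minimal sketch of a fix along these lines (my own code, not from the original answer; fitScaled and carData are names I introduce, and the step size 0.1 and iteration count are assumptions chosen for the scaled data): keeping the question's update rule but dividing the km values by their maximum first puts x in [0, 1], which bounds the gradient magnitudes so the fixed learning rate no longer oversteps; the learned slope is mapped back to the original scale at the end.

```go
package main

import "fmt"

// carData mirrors the km,price records from the question.
var carData = [][]float64{
	{240000, 3650}, {139800, 3800}, {150500, 4400}, {185530, 4450},
	{176000, 5250}, {114800, 5350}, {166800, 5800}, {89000, 5990},
	{144500, 5999}, {84000, 6200}, {82029, 6390}, {63060, 6390},
	{74000, 6600}, {97500, 6800}, {67000, 6800}, {76025, 6900},
	{48235, 6900}, {93000, 6990}, {60949, 7490}, {65674, 7555},
	{54000, 7990}, {68500, 7990}, {22899, 7990}, {61789, 8290},
}

// fitScaled runs the same update rule as the question, but with x divided
// by its maximum so every input lies in [0, 1]; the learned slope is
// mapped back to the original km scale before returning.
func fitScaled(data [][]float64, learningRate float64, iterations int) (m, c float64) {
	maxX := 0.0
	for _, d := range data {
		if d[0] > maxX {
			maxX = d[0]
		}
	}
	N := float64(len(data))
	for k := 0; k < iterations; k++ {
		dm, dc := 0.0, 0.0
		for _, d := range data {
			x := d[0] / maxX
			yPred := m*x + c
			dm += -2 / N * (d[1] - yPred) * x
			dc += -2 / N * (d[1] - yPred)
		}
		m -= learningRate * dm
		c -= learningRate * dc
	}
	return m / maxX, c
}

func main() {
	m, c := fitScaled(carData, 0.1, 100000)
	fmt.Printf("y = %.4fx + %.4f\n", m, c)
}
```

With the inputs scaled, the descent converges, and the printed line should land near the expected y = -0.02x + 8499.6.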