3D Linear Regression


Question


I want to write a program that, given a list of points in 3D-space, represented as an array of x,y,z coordinates in floating point, outputs a best-fit line in this space. The line can/should be in the form of a unit vector and a point on the line.


The problem is that I don't know how this is to be done. The closest thing I found was this link, though quite honestly I did not understand how he went from equation to equation and by the time we got to matrices I was pretty lost.


Is there a generalization of simple 2D linear regression that I can use/can someone explain (mathematically) if/how the above linked-to method works (and what one would have to do to compute the best-fit line using it)?

Answer

There is a standard formula for this, given by

T = inv(X' * X) * X' * y

where the result T is a vector of size n + 1 giving the coefficients of the function that best fits the data.


In your case n = 3. X is an m×(n+1) matrix called the design matrix -- in your case m×4. To construct the design matrix, you simply copy each data point's coordinate values (x1, x2, ...) into a row of X and, in addition, place the number 1 in column 1 of each row. The vector y holds the output values associated with those coordinates. The terms X' and inv(X' * X) are the transpose of X and the inverse of the product of the transpose of X and X. That last term can be computationally intensive to obtain, because inverting a matrix is O(n^3), but for you n = 4; as long as n is less than, say, 5000, it's no problem.
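To make the construction concrete, here is a minimal sketch (an addition, not part of the original computation), assuming your m points are stored one per row in an m×3 array P:

julia> design_matrix(P) = [ones(size(P, 1)) P]   # hypothetical helper: prepend a column of ones to P

julia> X = design_matrix([6 4 11; 8 5 15; 12 9 25; 2 1 3]);   # same entries as the X used below (as floats)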


Let's say you have data points (6,4,11) = 20, (8,5,15) = 30, (12,9,25) = 50, and (2,1,3) = 7. In that case, the design matrix X and the vector y are the ones shown in the Julia session below.


Then you simply have to multiply things out and you get T directly. Multiplying matrices is straightforward, and though it's more involved, taking the inverse of a matrix is also fairly straightforward (see here, for example). However, for scientific computing languages like Matlab, Octave, and Julia (which I'll illustrate with), it's a one-liner.

julia> X = [1 6 4 11; 1 8 5 15; 1 12 9 25; 1 2 1 3]
4x4 Array{Int64,2}:
 1   6  4  11
 1   8  5  15
 1  12  9  25
 1   2  1   3

julia> y = [20;30;50;7]
4-element Array{Int64,1}:
 20
 30
 50
  7

julia> T = pinv(X'*X)*X'*y
4-element Array{Float64,1}:
  4.0
 -5.5
 -7.0
  7.0


Verifying...

julia> 12*(-5.5) + 9*(-7.0) + 25*(7) + 4
50.0
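
You can also check all four points at once by multiplying the design matrix by the coefficients; since there are exactly as many points as unknowns here, the fit is exact:

julia> X * T
4-element Array{Float64,1}:
 20.0
 30.0
 50.0
  7.0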


In Julia, Matlab, and Octave, matrices can be multiplied simply by using *, and the transpose operator is '. Note that I used pinv (the pseudo-inverse) here, which is necessary (though not in this case) when the data is too redundant and gives rise to a non-invertible X'X; keep that in mind if you choose to implement matrix inversion yourself.
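
As a side note (not part of the original answer), these languages also provide the backslash operator for solving least-squares problems without forming the inverse explicitly, which is generally more numerically stable; in Julia, for example:

julia> T = X \ y   # same coefficients, no explicit inversion of X'X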


Principal Component Analysis (PCA) is a technique for dimensionality reduction; the objective is to find a k-dimensional space within an n-dimensional space such that the projection error is minimized. In the general case, n and k are arbitrary, but here n = 3 and k = 1. There are 4 main steps.


Step 1: Mean normalization

For the standard method to work, one must first perform mean normalization, and possibly also scale the data, so that the algorithm doesn't fail from floating-point error. In the latter case, that means there could be a problem if the range of values in one dimension is huge relative to another (like -1000 to 1000 in one dimension versus -0.1 to 0.2). Usually they're close enough, though. Mean normalization simply means that, for each dimension, you subtract the average from every data point, so that the resulting data set is centered around the origin. Take the result and store each data point (x1, x2, ..., xn) as a row in one big matrix X.

julia> X = [6 4 11; 8 5 15; 12 9 25; 2 1 3]
4x3 Array{Int64,2}:
  6  4  11
  8  5  15
 12  9  25
  2  1   3

Find the average of each dimension:

julia> y = [sum(X[1:4,j]) for j = 1:3] / 4   # per-dimension averages
3-element Array{Float64,1}:
  7.0 
 4.75
 13.5 


Normalize...

julia> Xm = X .- y'
4x3 Array{Float64,2}:
 -1.0  -0.75   -2.5
  1.0   0.25    1.5
  5.0   4.25   11.5
 -5.0  -3.75  -10.5
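
A quick sanity check (an extra step, not in the original answer): after mean normalization each column of Xm sums to zero, i.e. the data is centered on the origin:

julia> sum(Xm, 1)
1x3 Array{Float64,2}:
 0.0  0.0  0.0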


Step 2: Calculate the covariance matrix

The covariance matrix Sigma is simply

Sigma = (1/m) * Xm' * Xm


where m is the number of data points.
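
Written out for this data set (the next step folds this into the svd call), that is:

julia> Sigma = (1/4) * Xm' * Xm
3x3 Array{Float64,2}:
 13.0   10.25    28.5
 10.25   8.1875  22.625
 28.5   22.625   62.75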


Here it's best to just find a library that takes the covariance matrix and spits out the answer. There are many; implementations exist in Python, in R, and in Java, and of course in Octave, Julia, and Matlab (like R) it's another one-liner: svd.


Step 3: Perform SVD on the covariance matrix

julia> (U,S,V) = svd((1/4)*Xm'*Xm);


Step 4: Find the line

Take the first component (for k dimensions, you would take the first k components). The columns of U are the principal directions, ordered by the singular values in S, so the first column points along the direction of greatest variance.

Ureduce = U[:,1]
3-element Array{Float64,1}:
 -0.393041
 -0.311878
 -0.865015


This is the (unit) direction vector of the line that minimizes the projection error. Together with the centroid y computed above, which the best-fit line passes through, this gives the line in exactly the form the question asked for: a point and a unit vector.
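
In that form, a minimal sketch of the parametric line (using the names already defined above; the helper name line is illustrative):

julia> line(t) = y .+ t .* Ureduce   # point on the best-fit line at parameter t

julia> line(0.0)   # t = 0 recovers the centroid, which lies on the line
3-element Array{Float64,1}:
  7.0
  4.75
 13.5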


You can even recover an approximation of the original values, but they will all be lined up, projected onto the same line. Connect the dots to get a line segment.


Obtain the reduced-dimension representation of each of the data points in X (since this is 1-D, each will be a single value):

z= Ureduce' * Xm'
1x4 Array{Float64,2}:
2.78949  -1.76853  -13.2384  12.2174


Going back the other way: the original values, but all lying on the same (optimal) line.

julia> (Ureduce .* z .+ y)'
4x3 Array{Float64,2}:
  5.90362  3.88002   11.0871                         6  4  11
  7.69511  5.30157   15.0298      versus             8  5  15
 12.2032   8.87875   24.9514                        12  9  25
  2.19806  0.939664   2.93176                        2  1   3

