由qr.Q()迷惑:什么是“紧凑"矩阵中的正交矩阵?形式? [英] Mystified by qr.Q(): what is an orthonormal matrix in "compact" form?

查看:147
本文介绍了由qr.Q()迷惑:什么是“紧凑"矩阵中的正交矩阵?形式?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

R具有qr()函数,该函数使用LINPACK或LAPACK执行QR分解(以我的经验,后者快5%).返回的主要对象是矩阵"qr",该矩阵包含在上三角矩阵R中(即R=qr[upper.tri(qr)]).到目前为止,一切都很好. qr的下部三角形部分包含紧凑形式"的Q.可以使用qr.Q()从qr分解中提取Q.我想找到qr.Q()的反函数.换句话说,我确实有Q和R,并想将它们放在"qr"对象中. R是微不足道的,但Q不是.我们的目标是将其应用于qr.solve(),这在大型系统上要比solve()快得多.

解决方案

简介

R使用 LINPACK LAPACK DGEQP3 例程(如果指定),用于计算QR分解.这两个例程都使用Householder反射来计算分解.将mxn矩阵A分解为mxn经济规模正交矩阵(Q)和nxn上三角矩阵(R),如A = QR,其中Q可以通过t家用反射矩阵的乘积来计算,t较小的m-1和n:Q = H 1 H 2 ... H t .

每个反射矩阵H i 可以由长度(m-i + 1)向量表示.例如,H 1 需要长度为m的向量以进行紧凑存储.该向量中除一个外的所有条目都放置在输入矩阵下三角的第一列中(R因子使用对角线).因此,每个反射都需要一个以上的标量存储,这是由一个辅助矢量(在R的qr结果中称为$qraux)提供的.

在LINPACK和LAPACK例程之间使用的紧凑表示形式有所不同.

LINPACK方式

按照H i = I-v i v i T /p i ,其中I是单位矩阵,p i $qraux中的对应条目,而v i 如下:

  • v i [1..i-1] = 0,
  • v i [i] = p i
  • v i [i + 1:m] = A [i + 1..m,i](即,调用qr后A的下三角的列)

LINPACK示例

让我们通过R的Wikipedia中的来自QR分解文章的示例进行研究. /p>

要分解的矩阵是

> A <- matrix(c(12, 6, -4, -51, 167, 24, 4, -68, -41), nrow=3)
> A
     [,1] [,2] [,3]
[1,]   12  -51    4
[2,]    6  167  -68
[3,]   -4   24  -41

我们进行分解,结果的最相关部分如下所示:

> Aqr = qr(A)
> Aqr
$qr
            [,1]         [,2] [,3]
[1,] -14.0000000  -21.0000000   14
[2,]   0.4285714 -175.0000000   70
[3,]  -0.2857143    0.1107692  -35

[snip...]

$qraux
[1]  1.857143  1.993846 35.000000

[snip...]

(在幕后)通过计算两个Householder反射并将其乘以A以获得R,完成了分解.我们现在将根据$qr中的信息重新创建反射.

> p = Aqr$qraux   # for convenience
> v1 <- matrix(c(p[1], Aqr$qr[2:3,1]))
> v1
           [,1]
[1,]  1.8571429
[2,]  0.4285714
[3,] -0.2857143

> v2 <- matrix(c(0, p[2], Aqr$qr[3,2]))
> v2
          [,1]
[1,] 0.0000000
[2,] 1.9938462
[3,] 0.1107692

> I = diag(3)   # identity matrix
> H1 = I - v1 %*% t(v1)/p[1]   # I - v1*v1^T/p[1]
> H2 = I - v2 %*% t(v2)/p[2]   # I - v2*v2^T/p[2]

> Q = H1 %*% H2
> Q
           [,1]       [,2]        [,3]
[1,] -0.8571429  0.3942857  0.33142857
[2,] -0.4285714 -0.9028571 -0.03428571
[3,]  0.2857143 -0.1714286  0.94285714

现在让我们验证上面计算的Q是否正确:

> qr.Q(Aqr)
           [,1]       [,2]        [,3]
[1,] -0.8571429  0.3942857  0.33142857
[2,] -0.4285714 -0.9028571 -0.03428571
[3,]  0.2857143 -0.1714286  0.94285714

看起来不错!我们还可以验证QR等于A.

> R = qr.R(Aqr)   # extract R from Aqr$qr
> Q %*% R
     [,1] [,2] [,3]
[1,]   12  -51    4
[2,]    6  167  -68
[3,]   -4   24  -41

LAPACK方式

根据H i = I-p i v i v i T ,其中I是单位矩阵,p i $qraux中的对应条目,而v i 如下:

  • v i [1..i-1] = 0,
  • v i [i] = 1
  • v i [i + 1:m] = A [i + 1..m,i](即,调用qr后A的下三角的列)

在R中使用LAPACK例程时,存在另一种扭曲:使用了列透视,因此分解解决了另一个相关的问题:AP = QR,其中P是

注意$pivot字段;我们将回到这一点.现在我们从Aqr信息中生成Q.

> p = Bqr$qraux   # for convenience
> v1 = matrix(c(1, Bqr$qr[2:3,1]))
> v1
           [,1]
[1,]  1.0000000
[2,] -0.7348557
[3,] -0.1056080


> v2 = matrix(c(0, 1, Bqr$qr[3,2]))
> v2
          [,1]
[1,] 0.0000000
[2,] 1.0000000
[3,] 0.6859203


> H1 = I - p[1]*v1 %*% t(v1)   # I - p[1]*v1*v1^T
> H2 = I - p[2]*v2 %*% t(v2)   # I - p[2]*v2*v2^T
> Q = H1 %*% H2
           [,1]        [,2]       [,3]
[1,] -0.2893527 -0.46821615 -0.8348944
[2,]  0.9474882 -0.01602261 -0.3193891
[3,]  0.1361660 -0.88346868  0.4482655

再一次,上面计算的Q与R提供的Q一致.

> qr.Q(Bqr)
           [,1]        [,2]       [,3]
[1,] -0.2893527 -0.46821615 -0.8348944
[2,]  0.9474882 -0.01602261 -0.3193891
[3,]  0.1361660 -0.88346868  0.4482655

最后,让我们计算QR.

> R = qr.R(Bqr)
> Q %*% R
     [,1] [,2] [,3]
[1,]  -51    4   12
[2,]  167  -68    6
[3,]   24  -41   -4

注意到有区别吗? QR是A,其列按上面Bqr$pivot中的顺序排列.

R has a qr() function, which performs QR decomposition using either LINPACK or LAPACK (in my experience, the latter is 5% faster). The main object returned is a matrix "qr" that contains in the upper triangular matrix R (i.e. R=qr[upper.tri(qr)]). So far so good. The lower triangular part of qr contains Q "in compact form". One can extract Q from the qr decomposition by using qr.Q(). I would like to find the inverse of qr.Q(). In other word, I do have Q and R, and would like to put them in a "qr" object. R is trivial but Q is not. The goal is to apply to it qr.solve(), which is much faster than solve() on large systems.

Introduction

R uses the LINPACK dqrdc routine, by default, or the LAPACK DGEQP3 routine, when specified, for computing the QR decomposition. Both routines compute the decomposition using Householder reflections. An m x n matrix A is decomposed into an m x n economy-size orthogonal matrix (Q) and an n x n upper triangular matrix (R) as A = QR, where Q can be computed by the product of t Householder reflection matrices, with t being the lesser of m-1 and n: Q = H1H2...Ht.

Each reflection matrix Hi can be represented by a length-(m-i+1) vector. For example, H1 requires a length-m vector for compact storage. All but one entry of this vector is placed in the first column of the lower triangle of the input matrix (the diagonal is used by the R factor). Therefore, each reflection needs one more scalar of storage, and this is provided by an auxiliary vector (called $qraux in the result from R's qr).

The compact representation used is different between the LINPACK and LAPACK routines.

The LINPACK Way

A Householder reflection is computed as Hi = I - viviT/pi, where I is the identity matrix, pi is the corresponding entry in $qraux, and vi is as follows:

  • vi[1..i-1] = 0,
  • vi[i] = pi
  • vi[i+1:m] = A[i+1..m, i] (i.e., a column of the lower triangle of A after calling qr)

LINPACK Example

Let's work through the example from the QR decomposition article at Wikipedia in R.

The matrix being decomposed is

> A <- matrix(c(12, 6, -4, -51, 167, 24, 4, -68, -41), nrow=3)
> A
     [,1] [,2] [,3]
[1,]   12  -51    4
[2,]    6  167  -68
[3,]   -4   24  -41

We do the decomposition, and the most relevant portions of the result is shown below:

> Aqr = qr(A)
> Aqr
$qr
            [,1]         [,2] [,3]
[1,] -14.0000000  -21.0000000   14
[2,]   0.4285714 -175.0000000   70
[3,]  -0.2857143    0.1107692  -35

[snip...]

$qraux
[1]  1.857143  1.993846 35.000000

[snip...]

This decomposition was done (under the covers) by computing two Householder reflections and multiplying them by A to get R. We will now recreate the reflections from the information in $qr.

> p = Aqr$qraux   # for convenience
> v1 <- matrix(c(p[1], Aqr$qr[2:3,1]))
> v1
           [,1]
[1,]  1.8571429
[2,]  0.4285714
[3,] -0.2857143

> v2 <- matrix(c(0, p[2], Aqr$qr[3,2]))
> v2
          [,1]
[1,] 0.0000000
[2,] 1.9938462
[3,] 0.1107692

> I = diag(3)   # identity matrix
> H1 = I - v1 %*% t(v1)/p[1]   # I - v1*v1^T/p[1]
> H2 = I - v2 %*% t(v2)/p[2]   # I - v2*v2^T/p[2]

> Q = H1 %*% H2
> Q
           [,1]       [,2]        [,3]
[1,] -0.8571429  0.3942857  0.33142857
[2,] -0.4285714 -0.9028571 -0.03428571
[3,]  0.2857143 -0.1714286  0.94285714

Now let's verify the Q computed above is correct:

> qr.Q(Aqr)
           [,1]       [,2]        [,3]
[1,] -0.8571429  0.3942857  0.33142857
[2,] -0.4285714 -0.9028571 -0.03428571
[3,]  0.2857143 -0.1714286  0.94285714

Looks good! We can also verify QR is equal to A.

> R = qr.R(Aqr)   # extract R from Aqr$qr
> Q %*% R
     [,1] [,2] [,3]
[1,]   12  -51    4
[2,]    6  167  -68
[3,]   -4   24  -41

The LAPACK Way

A Householder reflection is computed as Hi = I - piviviT, where I is the identity matrix, pi is the corresponding entry in $qraux, and vi is as follows:

  • vi[1..i-1] = 0,
  • vi[i] = 1
  • vi[i+1:m] = A[i+1..m, i] (i.e., a column of the lower triangle of A after calling qr)

There is another twist when using the LAPACK routine in R: column pivoting is used, so the decomposition is solving a different, related problem: AP = QR, where P is a permutation matrix.

LAPACK Example

This section does the same example as before.

> A <- matrix(c(12, 6, -4, -51, 167, 24, 4, -68, -41), nrow=3)
> Bqr = qr(A, LAPACK=TRUE)
> Bqr
$qr
            [,1]        [,2]       [,3]
[1,] 176.2554964 -71.1694118   1.668033
[2,]  -0.7348557  35.4388886  -2.180855
[3,]  -0.1056080   0.6859203 -13.728129

[snip...]

$qraux
[1] 1.289353 1.360094 0.000000

$pivot
[1] 2 3 1

attr(,"useLAPACK")
[1] TRUE

[snip...]

Notice the $pivot field; we will come back to that. Now we generate Q from the information the Aqr.

> p = Bqr$qraux   # for convenience
> v1 = matrix(c(1, Bqr$qr[2:3,1]))
> v1
           [,1]
[1,]  1.0000000
[2,] -0.7348557
[3,] -0.1056080


> v2 = matrix(c(0, 1, Bqr$qr[3,2]))
> v2
          [,1]
[1,] 0.0000000
[2,] 1.0000000
[3,] 0.6859203


> H1 = I - p[1]*v1 %*% t(v1)   # I - p[1]*v1*v1^T
> H2 = I - p[2]*v2 %*% t(v2)   # I - p[2]*v2*v2^T
> Q = H1 %*% H2
           [,1]        [,2]       [,3]
[1,] -0.2893527 -0.46821615 -0.8348944
[2,]  0.9474882 -0.01602261 -0.3193891
[3,]  0.1361660 -0.88346868  0.4482655

Once again, the Q computed above agrees with the R-provided Q.

> qr.Q(Bqr)
           [,1]        [,2]       [,3]
[1,] -0.2893527 -0.46821615 -0.8348944
[2,]  0.9474882 -0.01602261 -0.3193891
[3,]  0.1361660 -0.88346868  0.4482655

Finally, let's compute QR.

> R = qr.R(Bqr)
> Q %*% R
     [,1] [,2] [,3]
[1,]  -51    4   12
[2,]  167  -68    6
[3,]   24  -41   -4

Notice the difference? QR is A with its columns permuted given the order in Bqr$pivot above.

这篇关于由qr.Q()迷惑:什么是“紧凑"矩阵中的正交矩阵?形式?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆