编写一个可跟踪的R函数,该函数模仿LAPACK的dgetrf进行LU分解 [英] Write a trackable R function that mimics LAPACK's dgetrf for LU factorization

查看:326
本文介绍了编写一个可跟踪的R函数,该函数模仿LAPACK的dgetrf进行LU分解的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

R核心中没有LU分解功能.尽管这种分解是solve的一个步骤,但并未明确使其可作为独立功能使用.我们可以为此编写一个R函数吗?它需要模仿LAPACK例程 dgetrf . Matrix程序包具有 lu函数,它很好,但是最好写一个可跟踪 R函数,该函数可以

There is no LU factorization function in R core. Although such factorization is a step of solve, it is not made explicitly available as a stand-alone function. Can we write an R function for this? It needs mimic LAPACK routine dgetrf. Matrix package has an lu function which is good, but it would be better if we could write a trackable R function, that can

  • 分解矩阵直到特定的列/行并返回中间结果;
  • 从中间结果继续分解到另一列/行或末尾.

此功能对于教育和调试目的都是有用的.教育的好处显而易见,因为我们可以逐列说明因式分解/高斯消去.供调试使用,这是两个示例.

This function would be useful for both educational and debugging purpose. The benefit for education is evident as we can illustrate the factorization / Gaussian elimination column by column. For debugging use, here are two examples.

R和Python中的LU分解之间不一致的结果中,有人问为什么R和Python中的LU分解会产生不同的结果结果.我们可以清楚地看到,两个软件都返回相同的第一个枢轴和第二个枢轴,而不是第三个枢轴.因此,当分解进行到第3行/第3列时,一定会有一些有趣的事情.如果我们可以检索该临时结果进行调查,那就太好了.

In Inconsistent results between LU decomposition in R and Python, it is asked why LU factorization in R and Python gives different result. We can clearly see that both software return identical 1st pivot and 2nd pivot, but not the 3rd. So there must be something interesting when the factorization proceeds to the 3rd row / column. It would be good if we could retrieve that temporary result for an investigation.

我可以稳定地反转R中具有许多小值的Vandermonde矩阵吗?对于这种类型的LU分解是不稳定的矩阵.在我的回答中,给出了一个3 x 3的矩阵作为示例.我希望solve会产生一个错误消息,抱怨U[3, 3] = 0,但是运行solve几次后,我发现solve有时是成功的.因此,对于数值研究,我想知道当因子分解进行到第二列/第二行时会发生什么.

In Can I stably invert a Vandermonde matrix with many small values in R? LU factorization is unstable for this type of matrix. In my answer, a 3 x 3 matrix is given for an example. I would expect solve to produce an error complaining U[3, 3] = 0, but running solve for a few times I find that solve is sometimes successful. So for a numerical investigation, I would like to know what happens when the factorization proceeds to the 2nd column / row.

由于该函数将用纯R代码编写,因此对于中等到较大的矩阵,预期它会变慢.但是性能不是问题,因为对于教育和调试,我们只使用一个小矩阵.

Since the function is to be written in pure R code, it is expected to be slow for a moderate to big matrix. But performance is not an issue, as for education and debugging we only use a small matrix.

dgetrf简介

LAPACK的dgetrf通过行透视:A = PLU计算LU分解.在因式分解退出时,

LAPACK's dgetrf computes LU factorization with row pivoting: A = PLU. On exit of the factorization,

  • L是一个单位下三角矩阵,存储在A的下三角部分中;
  • U是一个上三角矩阵,存储在A的上三角部分中;
  • P是作为单独的排列索引向量存储的行排列矩阵.
  • L is a unit lower triangular matrix, stored in the lower triangular part of A;
  • U is an upper triangular matrix, stored in the upper triangular part of A;
  • P is a row permutation matrix stored as a separate permutation index vector.

除非支点完全为零(不能达到一定的容差),否则应进行分解.

Unless a pivot is exactly zero (not up to some tolerance), factorization should proceed.

我从何开始

编写既不具有行透视功能又不具有暂停/继续"选项的LU因式分解并非没有挑战:

It is not challenging to write a LU factorization with neither row pivoting nor "pause / continue" option:

LU <- function (A) {

  ## check dimension
  n <- dim(A)
  if (n[1] != n[2]) stop("'A' must be a square matrix")
  n <- n[1]

  ## Gaussian elimination
  for (j in 1:(n - 1)) {

    ind <- (j + 1):n

    ## check if the pivot is EXACTLY 0
    piv <- A[j, j]
    if (piv == 0) stop(sprintf("system is exactly singular: U[%d, %d] = 0", j, j))

    l <- A[ind, j] / piv

    ## update `L` factor
    A[ind, j] <- l

    ## update `U` factor by Gaussian elimination
    A[ind, ind] <- A[ind, ind] - tcrossprod(l, A[j, ind])

    }

  A
  }

这表明在不需要旋转时可以给出正确的结果:

This is shown to give correct result when pivoting is not required:

A <- structure(c(0.923065107548609, 0.922819485189393, 0.277002309216186, 
0.532856695353985, 0.481061384081841, 0.0952619954477996, 
0.261916425777599, 0.433514681644738, 0.677919807843864, 
0.771985625848174, 0.705952850636095, 0.873727774480358, 
0.28782021952793, 0.863347264472395, 0.627262107795104, 
0.187472499441355), .Dim = c(4L, 4L))

oo <- LU(A)
oo
#          [,1]       [,2]       [,3]       [,4]
#[1,] 0.9230651  0.4810614 0.67791981  0.2878202
#[2,] 0.9997339 -0.3856714 0.09424621  0.5756036
#[3,] 0.3000897 -0.3048058 0.53124291  0.7163376
#[4,] 0.5772688 -0.4040044 0.97970570 -0.4479307

L <- diag(4)
low <- lower.tri(L)
L[low] <- oo[low]
L
#          [,1]       [,2]      [,3] [,4]
#[1,] 1.0000000  0.0000000 0.0000000    0
#[2,] 0.9997339  1.0000000 0.0000000    0
#[3,] 0.3000897 -0.3048058 1.0000000    0
#[4,] 0.5772688 -0.4040044 0.9797057    1

U <- oo
U[low] <- 0
U
#          [,1]       [,2]       [,3]       [,4]
#[1,] 0.9230651  0.4810614 0.67791981  0.2878202
#[2,] 0.0000000 -0.3856714 0.09424621  0.5756036
#[3,] 0.0000000  0.0000000 0.53124291  0.7163376
#[4,] 0.0000000  0.0000000 0.00000000 -0.4479307

Matrix软件包中的lu进行比较:

Comparison with lu from Matrix package:

library(Matrix)
rr <- expand(lu(A))
rr
#$L
#4 x 4 Matrix of class "dtrMatrix" (unitriangular)
#     [,1]       [,2]       [,3]       [,4]      
#[1,]  1.0000000          .          .          .
#[2,]  0.9997339  1.0000000          .          .
#[3,]  0.3000897 -0.3048058  1.0000000          .
#[4,]  0.5772688 -0.4040044  0.9797057  1.0000000
#
#$U
#4 x 4 Matrix of class "dtrMatrix"
#     [,1]        [,2]        [,3]        [,4]       
#[1,]  0.92306511  0.48106138  0.67791981  0.28782022
#[2,]           . -0.38567138  0.09424621  0.57560363
#[3,]           .           .  0.53124291  0.71633755
#[4,]           .           .           . -0.44793070
#
#$P
#4 x 4 sparse Matrix of class "pMatrix"
#            
#[1,] | . . .
#[2,] . | . .
#[3,] . . | .
#[4,] . . . |

现在考虑排列的A:

B <- A[c(4, 3, 1, 2), ]

LU(B)
#          [,1]         [,2]      [,3]       [,4]
#[1,] 0.5328567   0.43351468 0.8737278  0.1874725
#[2,] 0.5198439   0.03655646 0.2517508  0.5298057
#[3,] 1.7322952  -7.38348421 1.0231633  3.8748743
#[4,] 1.7318343 -17.93154011 3.6876940 -4.2504433

结果与LU(A)不同.但是,由于Matrix::lu执行行数据透视,因此lu(B)的结果仅与lu(A)的排列矩阵不同:

The result is different from LU(A). However, since Matrix::lu performs row pivoting, the result of lu(B) only differs from lu(A) in the permutation matrix:

expand(lu(B))$P
#4 x 4 sparse Matrix of class "pMatrix"
#            
#[1,] . . . |
#[2,] . . | .
#[3,] | . . .
#[4,] . | . .

推荐答案

让我们一一添加这些功能.

Let's add those features one by one.

这不太困难.

假设An x n.初始化置换索引向量pivot <- 1:n.在第j列中,我们扫描A[j:n, j]以获取最大绝对值.假设它是A[m, j].如果m > j,我们进行行交换A[m, ] <-> A[j, ].同时,我们进行排列pivot[j] <-> pivot[m].枢纽化之后,消除与不进行枢纽化的因式分解相同,因此我们可以重用函数LU的代码.

Suppose A is n x n. Initialize a permutation index vector pivot <- 1:n. At the j-th column we scan A[j:n, j] for the maximum absolute value. Suppose it is A[m, j]. If m > j we do a row exchange A[m, ] <-> A[j, ]. Meanwhile we do a permutation pivot[j] <-> pivot[m]. After pivoting, the elimination is as same as that for a factorization without pivoting, so we could reuse the code of function LU.

LUP <- function (A) {

  ## check dimension
  n <- dim(A)
  if (n[1] != n[2]) stop("'A' must be a square matrix")
  n <- n[1]

  ## LU factorization from the beginning to the end
  from <- 1
  to <- (n - 1)
  pivot <- 1:n

  ## Gaussian elimination
  for (j in from:to) {

    ## select pivot
    m <- which.max(abs(A[j:n, j]))

    ## A[j - 1 + m, j] is the pivot
    if (m > 1L) {
      ## row exchange
      tmp <- A[j, ]; A[j, ] <- A[j - 1 + m, ]; A[j - 1 + m, ] <- tmp
      tmp <- pivot[j]; pivot[j] <- pivot[j - 1 + m]; pivot[j - 1 + m] <- tmp
      }

    ind <- (j + 1):n

    ## check if the pivot is EXACTLY 0
    piv <- A[j, j]
    if (piv == 0) {
      stop(sprintf("system is exactly singular: U[%d, %d] = 0", j, j))
      }

    l <- A[ind, j] / piv

    ## update `L` factor
    A[ind, j] <- l

    ## update `U` factor by Gaussian elimination
    A[ind, ind] <- A[ind, ind] - tcrossprod(l, A[j, ind])

    }

  ## add `pivot` as an attribute and return `A`
  structure(A, pivot = pivot)

  }

问题中的尝试矩阵BLU(A)相同,只是具有附加的排列索引向量.

Trying matrix B in the question, LUP(B) is as same as LU(A) with an additional permutation index vector.

oo <- LUP(B)
#          [,1]       [,2]       [,3]       [,4]
#[1,] 0.9230651  0.4810614 0.67791981  0.2878202
#[2,] 0.9997339 -0.3856714 0.09424621  0.5756036
#[3,] 0.3000897 -0.3048058 0.53124291  0.7163376
#[4,] 0.5772688 -0.4040044 0.97970570 -0.4479307
#attr(,"pivot")
#[1] 3 4 2 1

这是一个提取LUP的实用函数:

Here is a utility function to extract L, U, P:

exLUP <- function (LUPftr) {
  L <- diag(1, nrow(LUPftr), ncol(LUPftr))
  low <- lower.tri(L)
  L[low] <- LUPftr[low]
  U <- LUPftr[1:nrow(LUPftr), ]  ## use "[" to drop attributes
  U[low] <- 0
  list(L = L, U = U, P = attr(LUPftr, "pivot"))
  }

rr <- exLUP(oo)
#$L
#          [,1]       [,2]      [,3] [,4]
#[1,] 1.0000000  0.0000000 0.0000000    0
#[2,] 0.9997339  1.0000000 0.0000000    0
#[3,] 0.3000897 -0.3048058 1.0000000    0
#[4,] 0.5772688 -0.4040044 0.9797057    1
#
#$U
#          [,1]       [,2]       [,3]       [,4]
#[1,] 0.9230651  0.4810614 0.67791981  0.2878202
#[2,] 0.0000000 -0.3856714 0.09424621  0.5756036
#[3,] 0.0000000  0.0000000 0.53124291  0.7163376
#[4,] 0.0000000  0.0000000 0.00000000 -0.4479307
#
#$P
#[1] 3 4 2 1

请注意,返回的排列索引确实适用于PA = LU(可能是教科书中使用最多的):

Note that the permutation index returned is really for PA = LU (probably the most used in textbooks):

all.equal( B[rr$P, ], with(rr, L %*% U) )
#[1] TRUE

要获取LAPACK返回的排列索引,即A = PLU中的一个,请执行order(rr$P).

To get the permutation index as returned by LAPACK, i.e., the one in A = PLU, do order(rr$P).

all.equal( B, with(rr, (L %*% U)[order(P), ]) )
#[1] TRUE


暂停/继续"选项

添加暂停/继续"功能有些棘手,因为我们需要某种方式来记录不完全因式分解的停止位置,以便稍后从那里取用.


"pause / continue" option

Adding "pause / continue" feature is a bit tricky, as we need some way to record where an incomplete factorization stops so that we can pick it up from there later.

假设我们要将功能LUP增强到一个新的LUP2.考虑添加参数to.用A[to, to]完成分解并将在A[to + 1, to + 1]下使用时,分解将停止.我们可以将此to以及临时的pivot向量存储为A的属性并返回.稍后,当我们将此临时结果传递回LUP2时,首先需要检查这些属性是否存在.如果是这样,它知道应该从哪里开始;否则,它会从头开始.

Suppose we are to enhance function LUP to a new one LUP2. Consider adding an argument to. The factorization will stop when it has done with A[to, to] and is going to work with A[to + 1, to + 1]. We can store this to, as well as the temporary pivot vector, as attributes to A and return. Later when we pass this temporary result back to LUP2, it need first check whether these attributes exist. If so it knows where it should start; otherwise it just starts right from the beginning.

LUP2 <- function (A, to = NULL) {

  ## check dimension
  n <- dim(A)
  if (n[1] != n[2]) stop("'A' must be a square matrix")
  n <- n[1]

  ## ensure that "to" has a valid value
  ## if it is not provided, set it to (n - 1) so that we complete factorization of `A`
  ## if provided, it can not be larger than (n - 1); otherwise it is reset to (n - 1)
  if (is.null(to)) to <- n - 1L
  else if (to > n - 1L) {
    warning(sprintf("provided 'to' too big; reset to maximum possible value: %d", n - 1L))
    to <- n - 1L
    }

  ## is `A` an intermediate result of a previous, unfinished LU factorization?
  ## if YES, it should have a "to" attribute, telling where the previous factorization stopped
  ## if NO, a new factorization starting from `A[1, 1]` is performed
  from <- attr(A, "to")

  if (!is.null(from)) {

    ## so we continue factorization, but need to make sure there is work to do
    from <- from + 1L
    if (from >= n) {
      warning("LU factorization of is already completed; return input as it is")
      return(A)
      }
    if (from > to) {
      stop(sprintf("please provide a bigger 'to' between %d and %d", from, n - 1L))
      }
    ## extract "pivot"
    pivot <- attr(A, "pivot")
    } else {

    ## we start a new factorization
    from <- 1
    pivot <- 1:n    

    }

  ## LU factorization from `A[from, from]` to `A[to, to]`
  ## the following code reuses function `LUP`'s code
  for (j in from:to) {

    ## select pivot
    m <- which.max(abs(A[j:n, j]))

    ## A[j - 1 + m, j] is the pivot
    if (m > 1L) {
      ## row exchange
      tmp <- A[j, ]; A[j, ] <- A[j - 1 + m, ]; A[j - 1 + m, ] <- tmp
      tmp <- pivot[j]; pivot[j] <- pivot[j - 1 + m]; pivot[j - 1 + m] <- tmp
      }

    ind <- (j + 1):n

    ## check if the pivot is EXACTLY 0
    piv <- A[j, j]
    if (piv == 0) {
      stop(sprintf("system is exactly singular: U[%d, %d] = 0", j, j))
      }

    l <- A[ind, j] / piv

    ## update `L` factor
    A[ind, j] <- l

    ## update `U` factor by Gaussian elimination
    A[ind, ind] <- A[ind, ind] - tcrossprod(l, A[j, ind])

    }

  ## update attributes of `A` and return `A`
  structure(A, to = to, pivot = pivot)
  }

尝试使用问题中的矩阵B.假设我们要在处理2列/行后停止分解.

Try with matrix B in the question. Let's say we want to stop the factorization after it has processed 2 columns / rows.

oo <- LUP2(B, 2)
#          [,1]       [,2]       [,3]      [,4]
#[1,] 0.9230651  0.4810614 0.67791981 0.2878202
#[2,] 0.9997339 -0.3856714 0.09424621 0.5756036
#[3,] 0.5772688 -0.4040044 0.52046170 0.2538693
#[4,] 0.3000897 -0.3048058 0.53124291 0.7163376
#attr(,"to")
#[1] 2
#attr(,"pivot")
#[1] 3 4 1 2

由于因子分解未完成,因此U因子不是上三角.这是一个提取它的辅助函数.

Since factorization is not complete, the U factor is not an upper triangular. Here is a helper function to extract it.

## usable for all functions: `LU`, `LUP` and `LUP2`
## for `LUP2` the attribute "to" is used;
## for other two we can simply zero the lower triangular of `A`
getU <- function (A) {
  attr(A, "pivot") <- NULL
  to <- attr(A, "to")
  if (is.null(to)) {
    A[lower.tri(A)] <- 0
    } else {
    n <- nrow(A)
    len <- (n - 1):(n - to)
    zero_ind <- sequence(len)
    offset <- seq.int(1L, by = n + 1L, length = to)
    zero_ind <- zero_ind + rep.int(offset, len)
    A[zero_ind] <- 0
    }
  A
  }

getU(oo)
#          [,1]       [,2]       [,3]      [,4]
#[1,] 0.9230651  0.4810614 0.67791981 0.2878202
#[2,] 0.0000000 -0.3856714 0.09424621 0.5756036
#[3,] 0.0000000  0.0000000 0.52046170 0.2538693
#[4,] 0.0000000  0.0000000 0.53124291 0.7163376
#attr(,"to")
#[1] 2

现在我们可以继续分解:

Now we can continue factorization:

LUP2(oo, 1)
#Error in LUP2(oo, 1) : please provide a bigger 'to' between 3 and 3

糟糕,我们错误地将了不可行的值to = 1传递给了LUP2,因为临时结果已经处理了2列/行,并且无法撤消它.该函数告诉我们,我们只能向前移动并将to设置为3到3之间的任何整数.如果传递的值大于3,将生成警告,并且to重置为最大可能值.

Oops, we have wrongly passed an infeasible value to = 1 to LUP2, because the temporary result has already processed 2 columns / rows and it can not undo it. The function tells us that we can only move forward and set to to any integers between 3 and 3. If we pass in a value larger than 3, a warning will be generated and to is reset to the maximum possible value.

oo <- LUP2(oo, 10)
#Warning message:
#In LUP2(oo, 10) :
#  provided 'to' too big; reset to maximum possible value: 3

我们有U因素

getU(oo)
#          [,1]       [,2]       [,3]       [,4]
#[1,] 0.9230651  0.4810614 0.67791981  0.2878202
#[2,] 0.0000000 -0.3856714 0.09424621  0.5756036
#[3,] 0.0000000  0.0000000 0.53124291  0.7163376
#[4,] 0.0000000  0.0000000 0.00000000 -0.4479307
#attr(,"to")
#[1] 3

oo现在是完整的分解结果.如果我们仍然要求LUP2进行更新怎么办?

The oo is now a complete factorization result. What if we still ask LUP2 to update it?

## without providing "to", it defaults to factorize till the end
oo <- LUP2(oo)
#Warning message:
#In LUP2(oo) :
#  LU factorization is already completed; return input as it is

它告诉您无法进行任何操作并按原样返回输入.

It tells you that nothing further can be done and return the input as it is.

最后,让我们尝试一个奇异的方阵.

Finally let's try a singular square matrix.

## this 4 x 4 matrix has rank 1
S <- tcrossprod(1:4, 2:5)

LUP2(S)
#Error in LUP2(S) : system is exactly singular: U[2, 2] = 0

## traceback
LUP2(S, to = 1)
#     [,1] [,2] [,3] [,4]
#[1,] 8.00   12   16   20
#[2,] 0.50    0    0    0
#[3,] 0.75    0    0    0
#[4,] 0.25    0    0    0
#attr(,"to")
#[1] 1
#attr(,"pivot")
#[1] 4 2 3 1

这篇关于编写一个可跟踪的R函数,该函数模仿LAPACK的dgetrf进行LU分解的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆