如何快速求解最小二乘法(不确定系统)? [英] How to solve a least squares (underdetermined system) quickly?

查看:84
本文介绍了如何快速求解最小二乘法(不确定系统)?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在R中有一个程序,该程序正在计算大量的最小二乘解(> 10,000:通常为100,000+),分析后,这些是程序的当前瓶颈.我有一个矩阵A,其中的列向量与跨度向量相对应,而解决方案b.我正在尝试求解Ax=b的最小二乘解x.矩阵的大小通常为4xj-其中许多不是正方形(j <4),因此我正在寻找欠定系统的一般解决方案.

I have a program in R that is computing a large amount of least squares solutions (>10,000: typically 100,000+) and, after profiling, these are the current bottlenecks for the program. I have a matrix A with column vectors that correspond to spanning vectors and a solution b. I am attempting to solve for the least-squares solution x of Ax=b. The matrices are typically 4xj in size - many of them are not square (j < 4) and so general solutions to under-determined systems are what I am looking for.

主要问题:解决R中欠定系统的最快方法是什么?我有许多利用正规方程的解决方案,但我正在寻找R比以下任何一种方法都快.

The main question: What is the fastest way to solve an under-determined system in R? I have many solutions that utilize the Normal Equation but am looking for a routine in R that is faster than any of the methods below.

例如: 鉴于以下约束,请针对系统Ax = b给出的x进行求解:

For example: Solve the system for x given by Ax = b given the following constraints:

  • 不需要确定系统(通常不确定)(ncol (A) <= length(b)始终成立).因此solve(A,b)不起作用,因为求解需要平方矩阵.
  • 您可以假定t(A) %*% A(等效于crossprod(A))不是单数-在程序中已对此进行了检查
  • 您可以使用R中免费提供的任何软件包
  • 解决方案不必很漂亮-它只需要快速
  • A的上限大约为10x10,很少出现零个元素-A通常非常密集
  • The system is not necessary determined [usually under-determined] (ncol (A) <= length(b) always holds). Thus solve(A,b) does not work because solve requires a square matrix.
  • You can assume that t(A) %*% A (equivalent to crossprod(A)) is non-singular - it is checked earlier in the program
  • You can use any package freely available in R
  • The solution need not be pretty - it just needs to be fast
  • An upper-bound on size of A is reasonably 10x10 and zero elements occur infrequently - A is usually pretty dense

两个随机矩阵进行测试...

Two random matrices for testing...

A = matrix(runif(12), nrow = 4)
b = matrix(runif(4), nrow = 4)

以下所有功能均已配置.它们在这里转载:

All of the functions below have been profiled. They are reproduced here:

f1 = function(A,b)
{
  solve(t(A) %*% A, t(A) %*% b)
}
f2 = function(A,b)
{
  solve(crossprod(A), crossprod(A, b))
}
f3 = function(A,b)
{
  ginv(crossprod(A)) %*% crossprod(A,b) # From the `MASS` package
}
f4 = function(A,b)
{
  matrix.inverse(crossprod(A)) %*% crossprod(A,b) # From the `matrixcalc` package
}
f5 = function(A,b)
{
  qr.solve(crossprod(A), crossprod(A,b))
}
f6 = function(A,b)
{
  svd.inverse(crossprod(A)) %*% crossprod(A,b)
}
f7 = function(A,b)
{
  qr.solve(A,b)
}
f8 = function(A,b)
{
  Solve(A,b) # From the `limSolve` package
}

经过测试,f2是当前的获胜者.我还测试了线性模型方法-考虑到它们产生的所有其他信息,它们的运行速度非常可笑.使用以下代码分析了代码:

After testing, f2 is the current winner. I have also tested linear model methods - they were ridiculously slow given all of the other information they produce. The code was profiled using the following:

library(ggplot2)
library(microbenchmark)

all.equal(
  f1(A,b),
  f2(A,b),
  f3(A,b),
  f4(A,b),
  f5(A,b),
  f6(A,b),
  f7(A,b),
  f8(A,b),
          )

compare = microbenchmark(
  f1(A,b),
  f2(A,b),
  f3(A,b),
  f4(A,b),
  f5(A,b),
  f6(A,b),
  f7(A,b),
  f8(A,b),
  times = 1000)

autoplot(compare)

推荐答案

Rcpp怎么样?

library(Rcpp)
cppFunction(depends='RcppArmadillo', code='
            arma::mat fRcpp (arma::mat A, arma::mat b) {
            arma::mat betahat ;
            betahat = (A.t() * A ).i() * A.t() * b ;
            return(betahat) ;
            }                                
            ')

all.equal(f1(A, b), f2(A, b), fRcpp(A, b))
#[1] TRUE
microbenchmark(f1(A, b), f2(A, b), fRcpp(A, b))
#Unit: microseconds
#        expr    min     lq     mean  median      uq     max neval
#    f1(A, b) 55.110 57.136 67.42110 59.5680 63.0120 160.873   100
#    f2(A, b) 34.444 37.685 43.86145 39.7120 41.9405 117.920   100
# fRcpp(A, b)  3.242  4.457  7.67109  8.1045  8.9150  39.307   100

这篇关于如何快速求解最小二乘法(不确定系统)?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆