如何在R中使用QR分解计算最小二乘估计量的方差? [英] How to calculate variance of least squares estimator using QR decomposition in R?

查看:603
本文介绍了如何在R中使用QR分解计算最小二乘估计量的方差?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试学习QR分解,但是在不依靠传统矩阵计算的情况下,无法弄清楚如何获得beta_hat的方差.我正在使用iris数据集进行练习,这是到目前为止的内容:

I'm trying to learn QR decomposition, but can't figure out how to get the variance of beta_hat without resorting to traditional matrix calculations. I'm practising with the iris data set, and here's what I have so far:

y<-(iris$Sepal.Length)
x<-(iris$Sepal.Width)
X<-cbind(1,x)
n<-nrow(X)
p<-ncol(X)
qr.X<-qr(X)
b<-(t(qr.Q(qr.X)) %*% y)[1:p]
R<-qr.R(qr.X)
beta<-as.vector(backsolve(R,b))
res<-as.vector(y-X %*% beta)

感谢您的帮助!

推荐答案

设置(复制代码)

y <- iris$Sepal.Length
x <- iris$Sepal.Width
X <- cbind(1,x)
n <- nrow(X)
p <- ncol(X)
qr.X <- qr(X)
b <- (t(qr.Q(qr.X)) %*% y)[1:p]  ## can be optimized; see Remark 1 below
R <- qr.R(qr.X)  ## can be optimized; see Remark 2 below
beta <- as.vector(backsolve(R, b))
res <- as.vector(y - X %*% beta)

数学

剩余自由度为n - p,因此估计方差为

Residual degree of freedom is n - p, so estimated variance is

se2 <- sum(res ^ 2) / (n - p)

因此,估计系数的方差协方差矩阵为

Thus, the variance covariance matrix of estimated coefficients is

V <- chol2inv(R) * se2

#           [,1]         [,2]
#[1,]  0.22934170 -0.07352916
#[2,] -0.07352916  0.02405009

验证

让我们通过与lm进行比较来检查其正确性:

validation

Let's check the correctness by comparing with lm:

fit <- lm(Sepal.Length ~ Sepal.Width, iris)

vcov(fit)

#            (Intercept) Sepal.Width
#(Intercept)  0.22934170 -0.07352916
#Sepal.Width -0.07352916  0.02405009

完全一样的结果!

您可以使用函数qr.qty代替b <- (t(qr.Q(qr.X)) %*% y)[1:p](以避免形成'Q'矩阵):

Instead of b <- (t(qr.Q(qr.X)) %*% y)[1:p], you can use function qr.qty (to avoid forming 'Q' matrix):

b <- qr.qty(qr.X, y)[1:p]

注释2(跳过形成"R"因子)

您不必为backsolve提取R <- qr.R(qr.X);使用qr.X$qr就足够了:

Remark 2 (skip forming 'R' factor)

You don't have to extract R <- qr.R(qr.X) for backsolve; using qr.X$qr is sufficient:

beta <- as.vector(backsolve(qr.X$qr, b))


附录:一种估算功能

以上是最简单的演示.在实践中,需要处理列旋转和秩不足的问题.以下是一个实现. X是模型矩阵,y是响应.结果应与lm(y ~ X + 0)进行比较.


Appendix: A function for estimation

The above is the simplest demonstration. In practice column pivoting and rank-deficiency need be dealt with. The following is an implementation. X is a model matrix and y is the response. Results should be compared with lm(y ~ X + 0).

qr_estimation <- function (X, y) {
  ## QR factorization
  QR <- qr(X)
  r <- QR$rank
  piv <- QR$pivot[1:r]
  ## estimate identifiable coefficients
  b <- qr.qty(QR, y)[1:r]
  beta <- backsolve(QR$qr, b, r)
  ## fitted values
  yhat <- base::c(X[, piv] %*% beta)
  ## residuals
  resi <- y - yhat
  ## error variance
  se2 <- base::c(crossprod(resi)) / (nrow(X) - r)
  ## variance-covariance for coefficients
  V <- chol2inv(QR$qr, r) * se2
  ## post-processing on pivoting and rank-deficiency
  p <- ncol(X)
  beta_full <- rep.int(NA_real_, p)
  beta_full[piv] <- beta
  V_full <- matrix(NA_real_, p, p)
  V_full[piv, piv] <- V
  ## return
  list(coefficients = beta_full, vcov = V_full,
       fitted.values = yhat, residuals = resi, sig = sqrt(se2))
  }

这篇关于如何在R中使用QR分解计算最小二乘估计量的方差?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆