Linear regression with constraints on the coefficients
Question
I am trying to perform linear regression, for a model like this:
Y = aX1 + bX2 + c
So Y ~ X1 + X2
Suppose I have the following response vector:
set.seed(1)
Y <- runif(100, -1.0, 1.0)
And the following matrix of predictors:
X1 <- runif(100, 0.4, 1.0)
X2 <- sample(rep(0:1,each=50))
X <- cbind(X1, X2)
I want to use the following constraints on the coefficients:
a + c >= 0
c >= 0
So no constraint on b.
I know that the glmc package can be used to apply constraints, but I was not able to determine how to apply it for my constraints. I also know that contr.sum can be used so that all coefficients sum to 0, for example, but that is not what I want to do. solve.QP() seems like another possibility, where setting meq=0 can be used so that all coefficients are >= 0 (again, not my goal here).
Note: the solution must be able to handle NA values in the response vector Y, for example:
Y <- runif(100, -1.0, 1.0)
Y[c(2,5,17,56,37,56,34,78)] <- NA
Answer
solve.QP can be passed arbitrary linear constraints, so it can certainly be used to model your constraints a+c >= 0 and c >= 0.
First, we can add a column of 1's to X to capture the intercept term, and then we can replicate standard linear regression with solve.QP:
X2 <- cbind(X, 1)
library(quadprog)
solve.QP(t(X2) %*% X2, t(Y) %*% X2, matrix(0, 3, 0), c())$solution
# [1] 0.08614041 0.21433372 -0.13267403
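As a quick sanity check (my addition, not part of the original answer), the unconstrained solve.QP solution should match the ordinary least squares coefficients from lm. The name Xi below is mine, used for the design matrix so the predictor X2 is not overwritten:

```r
library(quadprog)

# Rebuild the question's simulated data.
set.seed(1)
Y  <- runif(100, -1.0, 1.0)
X1 <- runif(100, 0.4, 1.0)
X2 <- sample(rep(0:1, each = 50))
Xi <- cbind(X1, X2, 1)  # design matrix with an intercept column

# Unconstrained QP: a zero-column Amat means no constraints at all.
qp  <- solve.QP(t(Xi) %*% Xi, t(Y) %*% Xi, matrix(0, 3, 0), c())$solution
ols <- unname(coef(lm(Y ~ X1 + X2)))  # order: intercept, a, b

# solve.QP returns (a, b, intercept); reorder lm's coefficients to compare.
all.equal(qp, ols[c(2, 3, 1)], tolerance = 1e-6)  # should be TRUE
```

Both calls solve the same normal equations, so agreement here confirms the QP setup is correct before any constraints are added.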
With the sample data from the question, neither constraint is met by standard linear regression.
By modifying both the Amat and bvec parameters, we can add our two constraints:
solve.QP(t(X2) %*% X2, t(Y) %*% X2, cbind(c(1, 0, 1), c(0, 0, 1)), c(0, 0))$solution
# [1] 0.0000000 0.1422207 0.0000000
Subject to these constraints, the squared residuals are minimized by setting the a and c coefficients both equal to 0.
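To make the constraint encoding explicit (a sketch of mine, not from the original answer; Xc is my name for the design matrix): each column of Amat encodes one constraint, read as t(Amat) %*% beta >= bvec, so we can verify that the returned solution satisfies both inequalities:

```r
library(quadprog)

# Rebuild the question's simulated data.
set.seed(1)
Y  <- runif(100, -1.0, 1.0)
X1 <- runif(100, 0.4, 1.0)
X2 <- sample(rep(0:1, each = 50))
Xc <- cbind(X1, X2, 1)  # design matrix with an intercept column

# Column 1 encodes a + c >= 0; column 2 encodes c >= 0.
Amat <- cbind(c(1, 0, 1), c(0, 0, 1))
beta <- solve.QP(t(Xc) %*% Xc, t(Y) %*% Xc, Amat, c(0, 0))$solution

t(Amat) %*% beta  # both entries should be >= 0
```

To add further constraints, append one column to Amat and one entry to bvec per constraint; setting meq > 0 would turn the first meq of them into equalities.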
You can handle missing values in Y or X2 just as the lm function does, by removing the offending observations. You might do something like the following as a pre-processing step:
has.missing <- rowSums(is.na(cbind(Y, X2))) > 0
Y <- Y[!has.missing]
X2 <- X2[!has.missing,]
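Putting the pieces together, here is a minimal sketch (the helper name constrained_fit is mine, not from the original answer) that drops incomplete observations and then solves the constrained fit:

```r
library(quadprog)

# Hypothetical helper: drop rows with any NA, then solve the constrained QP.
constrained_fit <- function(Y, X, Amat, bvec) {
  keep <- rowSums(is.na(cbind(Y, X))) == 0
  Y <- Y[keep]
  X <- X[keep, , drop = FALSE]
  solve.QP(t(X) %*% X, t(Y) %*% X, Amat, bvec)$solution
}

# Rebuild the question's data, including the NA values in Y.
set.seed(1)
Y  <- runif(100, -1.0, 1.0)
Y[c(2, 5, 17, 56, 37, 56, 34, 78)] <- NA
X1 <- runif(100, 0.4, 1.0)
X2 <- sample(rep(0:1, each = 50))

fit <- constrained_fit(Y, cbind(X1, X2, 1),
                       cbind(c(1, 0, 1), c(0, 0, 1)), c(0, 0))
fit  # three coefficients (a, b, c) satisfying a + c >= 0 and c >= 0
```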