Solving an overdetermined constraint system

Problem description

I have n real-valued variables (their values are unknown and I don't really care about them); let's call them X[n]. I also have m >> n relationships between them, let's call them R[m], of the form:

X[i] = alpha*X[j], where alpha is a positive real number, i and j are distinct, but the (i, j) pair is not necessarily unique (i.e. there can be two relationships between the same variables with different alpha factors).

What I'm trying to do is find a set of alpha parameters that solve the overdetermined system in some least-squares sense. The ideal solution would be to minimize the sum of squared differences between each equation parameter and its chosen value, but I'm satisfied with the following approximation:

If I turn the m equations into an overdetermined system in n unknowns, any pseudo-inverse based numeric solver will give me the obvious solution (all zeroes), because the system is homogeneous. So what I currently do is add another equation into the mix, x[0] = 1 (actually any constant will do), and solve the resulting system in the least-squares sense using the Moore-Penrose pseudo-inverse. This minimizes the sum of (x[0] - 1)^2 and the squares of x[i] - alpha*x[j] rather than the ideal objective, but I find it a good and numerically stable approximation to my problem. Here is an example:

a = 1
a = 2*b
b = 3*c
a = 5*c

In Octave:

A = [
  1  0  0;
  1 -2  0;
  0  1 -3;
  1  0 -5;
]

B = [1; 0; 0; 0]

C = pinv(A) * B
% or better yet, since B is zero except for its first entry:
C = pinv(A)(:,1)

This yields the values for a, b, c: [0.99383; 0.51235; 0.19136], which gives me the following (reasonable) relationships:

a = 1.9398*b
b = 2.6774*c
a = 5.1935*c
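As a rough illustration of the construction described above, the following Python sketch builds the rows of A and B from a list of relations and reads the fitted ratios back from a solution vector. The `(i, j, alpha)` tuple format and the helper names `build_system` and `fitted_ratios` are assumptions made for this example only; the solution vector is the rounded one quoted above.

```python
def build_system(n, relations, gauge_index=0, gauge_value=1.0):
    """Return dense rows A and right-hand side B for the least-squares system.

    relations is a list of (i, j, alpha) tuples meaning X[i] = alpha * X[j]
    (a hypothetical input format chosen for this sketch).
    The first row is the extra equation X[gauge_index] = gauge_value.
    """
    gauge_row = [0.0] * n
    gauge_row[gauge_index] = 1.0
    A, B = [gauge_row], [gauge_value]
    for i, j, alpha in relations:
        row = [0.0] * n
        row[i] = 1.0       # coefficient of X[i]
        row[j] = -alpha    # coefficient of X[j]
        A.append(row)
        B.append(0.0)
    return A, B


def fitted_ratios(x, relations):
    """For each relation, return the ratio X[i] / X[j] implied by the solution x."""
    return [x[i] / x[j] for i, j, _ in relations]


# Example from the question: a, b, c are indices 0, 1, 2.
relations = [(0, 1, 2.0), (1, 2, 3.0), (0, 2, 5.0)]
A, B = build_system(3, relations)

x = [0.99383, 0.51235, 0.19136]   # rounded solution quoted in the question
ratios = fitted_ratios(x, relations)
```

Comparing each entry of `ratios` with the requested alpha (2, 3 and 5 here) gives a quick view of how far the fit moved each relation.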

So right now I need to implement this in C / C++ / Java, and I have the following questions:

Is there a faster method to solve my problem, or am I on the right track with generating the overdetermined system and computing the pseudo-inverse?

My current solution requires a singular value decomposition and three matrix multiplications, which is a little much considering m can be 5000 or even 10000. Are there faster ways to compute the pseudo-inverse, given the sparsity of the matrix (each row contains exactly two non-zero values, one of which is always one and the other is always negative)? Actually, I only need its first column, not the entire matrix, since B is zero except for the first row.

What math libraries would you suggest to use for this? Is LAPACK ok?

I'm also open to any other suggestions, provided that they are numerically stable and asymptotically fast (let's say k*n^2, where k can be large).

Solution

The SVD approach is numerically very stable but not very fast. If you use SVD, then LAPACK is a good library to use. If it's just a one-off computation, then it's probably fast enough.

If you need a substantially faster algorithm, you might have to sacrifice stability. One possibility would be to use the QR factorization. You'll have to read up on this to see the details, but part of the reasoning goes as follows. If AP = QR (where P is a permutation matrix, Q is an orthogonal matrix, and R is a triangular matrix) is the economy QR decomposition of A, then the equation AX = B becomes Q R P^{-1} X = B, and the least-squares solution is X = P R^{-1} Q^T B. The following Octave code illustrates this using the same A and B as in your code.

[Q,R,P] = qr(A,0)      % economy QR with column pivoting; here P is returned as a permutation vector
C(P) = R \ (Q' * B)    % solve R*y = Q'*B, then undo the column permutation
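For readers who want to see the steps spelled out without a library, here is a minimal pure-Python sketch of a QR-based least-squares solve. It is only an illustration under assumptions: it uses modified Gram-Schmidt without column pivoting (so there is no P), which is simpler than the Householder-based routines that libraries such as LAPACK provide, and the function name `qr_least_squares` is made up for this example.

```python
import math


def qr_least_squares(A, b):
    """Least-squares solve of A x = b via an economy QR factorization.

    Uses modified Gram-Schmidt without pivoting; A (a list of rows)
    is assumed to have full column rank.
    """
    m, n = len(A), len(A[0])
    Q = []                                  # orthonormal columns, each of length m
    R = [[0.0] * n for _ in range(n)]       # upper triangular factor
    for j in range(n):
        v = [A[i][j] for i in range(m)]     # j-th column of A
        for k in range(j):
            # Remove the component along the already computed column Q[k].
            R[k][j] = sum(Q[k][i] * v[i] for i in range(m))
            v = [v[i] - R[k][j] * Q[k][i] for i in range(m)]
        R[j][j] = math.sqrt(sum(t * t for t in v))
        if R[j][j] == 0.0:
            raise ValueError("A does not have full column rank")
        Q.append([t / R[j][j] for t in v])
    # y = Q^T b
    y = [sum(Q[j][i] * b[i] for i in range(m)) for j in range(n)]
    # Back substitution for R x = y.
    x = [0.0] * n
    for j in reversed(range(n)):
        s = y[j] - sum(R[j][k] * x[k] for k in range(j + 1, n))
        x[j] = s / R[j][j]
    return x


# Same A and B as in the question.
A = [[1.0, 0.0, 0.0],
     [1.0, -2.0, 0.0],
     [0.0, 1.0, -3.0],
     [1.0, 0.0, -5.0]]
B = [1.0, 0.0, 0.0, 0.0]
x = qr_least_squares(A, B)
```

For real problem sizes a library QR (dense or sparse) is the safer choice; the sketch is meant only to show where R^{-1} Q^T B comes from.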

The nice thing about this is that you can exploit the sparsity of A by doing a sparse QR decomposition. There is some explanation in the Octave help for the qr function but it did not work for me immediately.

Even faster (but also even less stable, because the condition number of A^T A is the square of that of A) is to use the normal equations: if A X = B then A^T A X = A^T B. The matrix A^T A is an n-by-n square matrix of (hopefully) full rank, so you can use any solver for linear equations. Octave code:

C = (A' * A) \ (A' * B)
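The same idea can be sketched in plain Python. Because every relation row has only two nonzero entries, A^T A can be accumulated directly from the relations without ever forming A. The `(i, j, alpha)` tuple format, the function names, and the choice of Gaussian elimination with partial pivoting are assumptions of this sketch, not something prescribed by the answer.

```python
def normal_equations(n, relations, gauge_index=0, gauge_value=1.0):
    """Accumulate M = A^T A and v = A^T B directly from the relations.

    relations is a list of (i, j, alpha) tuples meaning X[i] = alpha * X[j].
    """
    M = [[0.0] * n for _ in range(n)]
    v = [0.0] * n
    # Extra equation X[gauge_index] = gauge_value.
    M[gauge_index][gauge_index] += 1.0
    v[gauge_index] += gauge_value
    # Each row X[i] - alpha*X[j] = 0 touches only four entries of M.
    for i, j, alpha in relations:
        M[i][i] += 1.0
        M[j][j] += alpha * alpha
        M[i][j] -= alpha
        M[j][i] -= alpha
    return M, v


def solve_linear(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [rhs] for row, rhs in zip(M, v)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("normal-equations matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


# Example from the question: a, b, c are indices 0, 1, 2.
relations = [(0, 1, 2.0), (1, 2, 3.0), (0, 2, 5.0)]
M, v = normal_equations(3, relations)
x = solve_linear(M, v)
```

Assembling M costs O(m); the dense solve is O(n^3). Since A^T A is symmetric positive definite when A has full column rank, a Cholesky factorization from a library would be the more usual choice than the generic elimination used here.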

Again, sparsity can be exploited in this approach. There are many methods and libraries for solving sparse linear systems; a popular one seems to be UMFPACK.
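One way to exploit the sparsity without a sparse direct solver is an iterative method. The sketch below applies conjugate gradients to the normal equations (often called CGLS or CGNR), which is a swapped-in technique rather than anything the answer specifically recommends. It needs only products with A and A^T, each costing O(m) for this row structure. The relation format, function name, tolerance and iteration cap are all assumptions of this example.

```python
def cg_normal_equations(n, relations, gauge_index=0, gauge_value=1.0,
                        tol=1e-20, max_iter=1000):
    """Least-squares solve via conjugate gradients on A^T A x = A^T B.

    Row 0 of A is the extra equation X[gauge_index] = gauge_value;
    row k >= 1 is X[i] - alpha*X[j] = 0 for the k-th (i, j, alpha) relation.
    A is never stored explicitly.
    """
    def apply_A(p):
        out = [p[gauge_index]]
        out.extend(p[i] - alpha * p[j] for i, j, alpha in relations)
        return out

    def apply_At(r):
        out = [0.0] * n
        out[gauge_index] += r[0]
        for k, (i, j, alpha) in enumerate(relations, start=1):
            out[i] += r[k]
            out[j] -= alpha * r[k]
        return out

    x = [0.0] * n
    r = [gauge_value] + [0.0] * len(relations)   # residual B - A x for x = 0
    z = apply_At(r)                              # gradient direction A^T r
    p = z[:]
    zz = sum(t * t for t in z)
    for _ in range(max_iter):
        if zz <= tol:                            # squared norm of A^T r is small enough
            break
        w = apply_A(p)
        step = zz / sum(t * t for t in w)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * wi for ri, wi in zip(r, w)]
        z = apply_At(r)
        zz_new = sum(t * t for t in z)
        p = [zi + (zz_new / zz) * pi for zi, pi in zip(z, p)]
        zz = zz_new
    return x


# Example from the question: a, b, c are indices 0, 1, 2.
relations = [(0, 1, 2.0), (1, 2, 3.0), (0, 2, 5.0)]
x = cg_normal_equations(3, relations)
```

How many iterations are needed depends on the conditioning of A, so this inherits the stability caveat of the normal equations; it is worth checking the result against a direct solve on a few representative inputs.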

Added later: I don't know enough about this field to quantify; whole books have been written on this. Perhaps QR is about a factor of 3 or 5 faster than SVD, and the normal equations twice as fast again. The effect on the numerical stability depends on your matrix A. Sparse algorithms can be much faster (say a factor of m), but their computational cost and numerical stability depend very much on the problem, in ways that are sometimes not well understood.

In your use case, my recommendation would be to try computing the solution with the SVD, see how long it takes, and if that is acceptable then just use that (I guess it would be about a minute for n=1000 and m=10000). If you want to study it further, try also QR and normal equations and see how much faster they are and how accurate; if they give approximately the same solution as SVD then you can be pretty confident they are accurate enough for your purposes. Only if these are all too slow and you are willing to sink some time into it, look at sparse algorithms.
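The comparison suggested above can be as simple as a relative difference between two solution vectors. The helper name and the acceptance threshold below are arbitrary assumptions for illustration; the reference vector is the rounded SVD result quoted in the question, and the other vector is the exact solution of that example's normal equations.

```python
import math


def relative_difference(x, reference):
    """Return ||x - reference|| / ||reference|| in the Euclidean norm."""
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, reference)))
    return diff / math.sqrt(sum(b * b for b in reference))


x_svd = [0.99383, 0.51235, 0.19136]          # rounded SVD result quoted in the question
x_other = [161 / 162, 83 / 162, 31 / 162]    # normal-equations solution of the same example

# The 1e-4 threshold is an arbitrary choice; pick one that matches your accuracy needs.
close_enough = relative_difference(x_other, x_svd) < 1e-4
```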
