Solving an underdetermined scipy.sparse matrix using SVD



Problem

I have a set of equations in which variables are denoted by lowercase letters and constants by uppercase letters, as follows

A = a + b  
B = c + d  
C = a + b + c + d + e  

I'm provided the information as to the structure of these equations in a pandas DataFrame with two columns: Constants and Variables

E.g.

import pandas as pd

df = pd.DataFrame([['A','a'],['A','b'],['B','c'],['B','d'],['C','a'],['C','b'],
                   ['C','c'],['C','d'],['C','e']], columns=['Constants','Variables'])

I then convert this to a sparse CSC matrix by using NetworkX

import networkx as nx

# note: NetworkX >= 2.0 renamed from_pandas_dataframe to from_pandas_edgelist
G = nx.from_pandas_dataframe(df, 'Constants', 'Variables')
table = nx.bipartite.biadjacency_matrix(G, df.Constants.unique(), df.Variables.unique(),
                                        format='csc')

When converted to a dense matrix, table looks like the following

matrix([[1, 1, 0, 0, 0],[0, 0, 1, 1, 0],[1, 1, 1, 1, 1]], dtype=int64)

What I want from here is to find which variables are solvable (in this example, only e is solvable) and, for each solvable variable, which constants its value depends on (in this case, since e = C - B - A, it depends on A, B, and C).

Attempts at Solution

I first tried to use rref to solve for the solvable variables. I used the symbolic library sympy and the function sympy.Matrix.rref, which gave me exactly what I wanted, since any solvable variable ends up with its own row that is all zeros except for a single 1, which I could then scan for row by row.
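A minimal sketch of such an rref check, assuming the system is augmented with an identity block so that each row also records which constants it combines (this augmentation is only one possible setup, chosen here for illustration):

import sympy as sp

# coefficient matrix for a, b, c, d, e, augmented with an identity block for A, B, C
M = sp.Matrix([[1, 1, 0, 0, 0],
               [0, 0, 1, 1, 0],
               [1, 1, 1, 1, 1]])
aug = M.row_join(sp.eye(3))
reduced, pivots = aug.rref()
print(reduced)
# a row whose variable part is all zeros except a single 1 marks a solvable
# variable; its constant part gives the combination of A, B, C it depends on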

However, this solution was not stable. Primarily, it was exceedingly slow and did not take advantage of the fact that my datasets are likely to be very sparse. Moreover, rref does not handle floating-point arithmetic very well. So I decided to move on to another approach, motivated by Removing unsolvable equations from an underdetermined system, which suggested using SVD.

Conveniently, there is an SVD function in the scipy.sparse library, namely scipy.sparse.linalg.svds. However, given my lack of linear algebra background, I don't understand the results this function outputs when run on my table, or how to use those results to get what I want.
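For reference, a call along these lines (svds requires a floating-point matrix, and k must be smaller than min(table.shape), so at most 2 for this 3x5 example) returns the truncated singular triplets:

from scipy.sparse.linalg import svds

# table is the 3x5 sparse CSC matrix built above; svds needs a float dtype
u, s, vt = svds(table.asfptype(), k=2)
print(s)     # the two largest singular values
print(vt)    # right singular vectors, one per row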

Further Details in the Problem

  1. The coefficient of every variable in my problem is 1. This is how the data can be expressed in the two-column pandas DataFrame shown earlier
  2. The vast majority of variables in my actual examples will not be solvable. The goal is to find the few that are solvable
  3. I'm more than willing to try an alternate approach if it fits the constraints of this problem.

This is my first time posting a question, so I apologize if this doesn't exactly follow guidelines. Please leave constructive criticism but be gentle!

Solution

The system you are solving has the form

[ 1 1 0 0 0 ] [a]   [A]
[ 0 0 1 1 0 ] [b] = [B]
[ 1 1 1 1 1 ] [c]   [C]
              [d]
              [e]

i.e., three equations for five variables a, b, c, d, e. As the answer linked in your question mentions, one can tackle such an underdetermined system with the pseudoinverse, which NumPy provides directly via the pinv function.

Since M has linearly independent rows, the pseudoinverse in this case has the property that M.pinv(M) = I, where I denotes the identity matrix (3x3 in this case). Thus, formally, we can write the solution as:

v = pinv(M) . b

where v is the 5-component solution vector and b denotes the right-hand side 3-component vector [A, B, C]. However, this solution is not unique, since one can add a vector from the so-called kernel or null space of the matrix M (i.e., a vector w for which M.w = 0) and it will still be a solution:

M.(v + w) = M.v + M.w = b + 0 = b

Therefore, the only variables for which there is a unique solution are those for which the corresponding component of every possible vector from the null space of M is zero. In other words, if you assemble a basis of the null space into a matrix (one basis vector per column), then the "solvable variables" correspond to the zero rows of this matrix (the corresponding component of any linear combination of the columns will then also be zero).
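As a quick numerical illustration of this non-uniqueness (using hypothetical values A=1, B=2, C=9, with scipy.linalg.null_space from recent SciPy standing in for the null-space computation):

import numpy as np
from numpy.linalg import pinv
from scipy.linalg import null_space

M = np.array([[1, 1, 0, 0, 0],
              [0, 0, 1, 1, 0],
              [1, 1, 1, 1, 1]], dtype=float)
b = np.array([1.0, 2.0, 9.0])       # hypothetical values for A, B, C

v = pinv(M) @ b                     # one particular solution: [0.5, 0.5, 1, 1, 6]
w = null_space(M)[:, 0]             # any vector from the kernel of M
print(M @ v)                        # [1. 2. 9.]
print(M @ (v + 3 * w))              # also (numerically) [1. 2. 9.]: a, b, c, d are not unique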

Let's apply this to your particular example:

import numpy as np
from numpy.linalg import pinv

M = [
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [1, 1, 1, 1, 1]
]

print(pinv(M))

[[ 5.00000000e-01 -2.01966890e-16  1.54302378e-16]
 [ 5.00000000e-01  1.48779676e-16 -2.10806254e-16]
 [-8.76351626e-17  5.00000000e-01  8.66819360e-17]
 [-2.60659800e-17  5.00000000e-01  3.43000417e-17]
 [-1.00000000e+00 -1.00000000e+00  1.00000000e+00]]

From this pseudoinverse, we see that the variable e (last row) is indeed expressible as -A - B + C. However, it also "predicts" that a = A/2 and b = A/2. To rule out such non-unique solutions (a = A and b = 0, for example, would be equally valid), let's calculate the null space, borrowing the nullspace function from the SciPy Cookbook:
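(For reference, a condensed SVD-based version of that cookbook helper is sketched below; recent SciPy also provides scipy.linalg.null_space, which does the same job.)

import numpy as np

def nullspace(A, atol=1e-13, rtol=0):
    # Null-space basis via SVD: the rows of vh belonging to (numerically)
    # zero singular values span the kernel; return them as columns.
    A = np.atleast_2d(A)
    u, s, vh = np.linalg.svd(A)
    tol = max(atol, rtol * s[0])
    rank = int((s >= tol).sum())
    return vh[rank:].conj().T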

print(nullspace(M))

[[ 5.00000000e-01 -5.00000000e-01]
 [-5.00000000e-01  5.00000000e-01]
 [-5.00000000e-01 -5.00000000e-01]
 [ 5.00000000e-01  5.00000000e-01]
 [-1.77302319e-16  2.22044605e-16]]

This function returns the basis of the null space already assembled into a matrix (one vector per column), and we see that, to within reasonable precision, the only zero row is indeed the last one, corresponding to the variable e.
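Putting the two pieces together, a small sketch along these lines (the variable and constant names are just the ones from this example) would list the solvable variables and the constants they depend on:

import numpy as np
from numpy.linalg import pinv
from scipy.linalg import null_space

M = np.array([[1, 1, 0, 0, 0],
              [0, 0, 1, 1, 0],
              [1, 1, 1, 1, 1]], dtype=float)
variables = ['a', 'b', 'c', 'd', 'e']
constants = ['A', 'B', 'C']

ns = null_space(M)                              # same role as the cookbook nullspace()
P = pinv(M)
solvable = np.all(np.abs(ns) < 1e-10, axis=1)   # zero rows of the null-space basis
for i in np.flatnonzero(solvable):
    deps = [c for c, w in zip(constants, P[i]) if abs(w) > 1e-10]
    print(variables[i], 'is solvable and depends on', deps)   # e -> ['A', 'B', 'C']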

EDIT:

For the set of equations

A = a + b, B = b + c, C = a + c

the corresponding matrix M is

[ 1 1 0 ]
[ 0 1 1 ]
[ 1 0 1 ]

Here we see that the matrix is in fact square and invertible (the determinant is 2). Thus the pseudoinverse coincides with the ordinary ("normal") inverse:

[[ 0.5 -0.5  0.5]
 [ 0.5  0.5 -0.5]
 [-0.5  0.5  0.5]]

which corresponds to the solution a = (A - B + C)/2, .... Since M is invertible, its kernel / null space is trivial (it contains only the zero vector), which is why the cookbook function returns only []. To see this, use the definition of the kernel: it consists of all non-zero vectors x such that M.x = 0. However, since M^{-1} exists, any such x would be given by x = M^{-1} . 0 = 0, which is a contradiction. Formally, this means that the solution found is unique (or that all variables are "solvable").
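A quick check of this case (a sketch, again using scipy.linalg.null_space in place of the cookbook helper):

import numpy as np
from scipy.linalg import null_space

M2 = np.array([[1, 1, 0],
               [0, 1, 1],
               [1, 0, 1]], dtype=float)
print(np.linalg.det(M2))          # ~2 -> invertible
print(np.linalg.inv(M2))          # coincides with pinv(M2) here
print(null_space(M2))             # [] -> trivial kernel, so a, b, c are all solvable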
