Singular Value Decomposition: Different results with Jama, PColt and NumPy


Problem description

I want to perform Singular Value Decomposition on a large (sparse) matrix. In order to choose the best (most accurate) library, I tried replicating the SVD example provided here using different Java and Python libraries. Strangely, I am getting different results with each library.

Here's the original example matrix and its decomposed matrices (U, S and VT):

A =2.0  0.0 8.0 6.0 0.0
   1.0 6.0 0.0 1.0 7.0
   5.0 0.0 7.0 4.0 0.0
   7.0 0.0 8.0 5.0 0.0 
   0.0 10.0 0.0 0.0 7.0

U =-0.54 0.07 0.82 -0.11 0.12
   -0.10 -0.59 -0.11 -0.79 -0.06
   -0.53 0.06 -0.21 0.12 -0.81
   -0.65 0.07 -0.51 0.06 0.56
   -0.06 -0.80 0.09 0.59 0.04

VT =-0.46 0.02 -0.87 -0.00 0.17
    -0.07 -0.76 0.06 0.60 0.23
    -0.74 0.10 0.28 0.22 -0.56
    -0.48 0.03 0.40 -0.33 0.70
    -0.07 -0.64 -0.04 -0.69 -0.32

S (with the top three singular values) = 
   17.92 0 0
   0 15.17 0
   0 0 3.56

I tried using the following Java and Python libraries: PColt and Jama for Java, NumPy for Python.

Here are the results from each of them:

Jama:
U = 0.5423  -0.065  -0.8216 0.1057  -0.1245 
    0.1018  0.5935  0.1126  0.7881  0.0603  
    0.525   -0.0594 0.213   -0.1157 0.8137  
    0.6449  -0.0704 0.5087  -0.0599 -0.5628 
    0.0645  0.7969  -0.09   -0.5922 -0.0441 

VT =0.4646  -0.0215 0.8685  8.0E-4  -0.1713 
    0.0701  0.76    -0.0631 -0.6013 -0.2278 
    0.7351  -0.0988 -0.284  -0.2235 0.565   
    0.4844  -0.0254 -0.3989 0.3327  -0.7035 
    0.065   0.6415  0.0443  0.6912  0.3233  

S = 17.9184 0.0 0.0 0.0 0.0 
    0.0 15.1714 0.0 0.0 0.0 
    0.0 0.0 3.564   0.0 0.0 
    0.0 0.0 0.0 1.9842  0.0 
    0.0 0.0 0.0 0.0 0.3496  

PColt:
U = -0.542255   0.0649957  0.821617  0.105747  -0.124490 
    -0.101812  -0.593461 -0.112552  0.788123   0.0602700
    -0.524953   0.0593817 -0.212969 -0.115742   0.813724 
    -0.644870   0.0704063 -0.508744 -0.0599027 -0.562829 
    -0.0644952 -0.796930  0.0900097 -0.592195  -0.0441263

VT =-0.464617   0.0215065 -0.868509   0.000799554 -0.171349
    -0.0700860 -0.759988  0.0630715 -0.601346   -0.227841
    -0.735094   0.0987971  0.284009  -0.223485    0.565040
    -0.484392   0.0254474  0.398866   0.332684   -0.703523
    -0.0649698 -0.641520 -0.0442743  0.691201    0.323284

S = 
(00)    17.91837085874625
(11)    15.17137188041607
(22)    3.5640020352605677
(33)    1.9842281528992616
(44)    0.3495556671751232


NumPy:

U = -0.54225536  0.06499573  0.82161708  0.10574661 -0.12448979
    -0.10181247 -0.59346055 -0.11255162  0.78812338  0.06026999
    -0.52495325  0.05938171 -0.21296861 -0.11574223  0.81372354
    -0.64487038  0.07040626 -0.50874368 -0.05990271 -0.56282918
    -0.06449519 -0.79692967  0.09000966 -0.59219473 -0.04412631

VT =-4.64617e-01   2.15065e-02  -8.68508e-01    7.99553e-04  -1.71349e-01
    -7.00859e-02  -7.59987e-01   6.30714e-02   -6.01345e-01  -2.27841e-01
    -7.35093e-01   9.87971e-02   2.84008e-01   -2.23484e-01   5.65040e-01
    -4.84391e-01   2.54473e-02   3.98865e-01    3.32683e-01  -7.03523e-01
    -6.49698e-02  -6.41519e-01  -4.42743e-02    6.91201e-01   3.23283e-01

S = 17.91837086  15.17137188   3.56400204   1.98422815   0.34955567

As can be noticed, the sign of each element in the Jama decomposed matrices (U and VT) is opposite to that in the original example. Interestingly, for PColt and NumPy only the signs of the elements in the last two columns are inverted. Is there any specific reason behind the inverted signs? Has anyone faced similar discrepancies?

Here are the pieces of code I used:

Java:

import java.text.DecimalFormat;
import cern.colt.matrix.tdouble.DoubleMatrix2D;
import cern.colt.matrix.tdouble.algo.decomposition.DenseDoubleSingularValueDecomposition;
import cern.colt.matrix.tdouble.impl.DenseDoubleMatrix2D;
import Jama.Matrix;
import Jama.SingularValueDecomposition;
public class SVD_Test implements java.io.Serializable{

    public static void main(String[] args)
    {   

        double[][] data2 = new double[][]
                {{ 2.0, 0.0, 8.0, 6.0, 0.0},
                { 1.0, 6.0, 0.0, 1.0, 7.0},
                { 5.0, 0.0, 7.0, 4.0, 0.0},
                { 7.0, 0.0, 8.0, 5.0, 0.0},
                { 0.0, 10.0, 0.0, 0.0, 7.0}};

        // Build the same matrix once as a Parallel Colt DenseDoubleMatrix2D and once as a Jama Matrix
        DoubleMatrix2D pColt_matrix = new DenseDoubleMatrix2D(5, 5);
        pColt_matrix.assign(data2);
        Matrix j = new Matrix(data2);

        // SVD with Jama
        SingularValueDecomposition svd_jama = j.svd();

        // SVD with Parallel Colt
        DenseDoubleSingularValueDecomposition svd_pColt = new DenseDoubleSingularValueDecomposition(pColt_matrix, true, true);
        System.out.println("U:");
        System.out.println("pColt:");
        System.out.println(svd_pColt.getU());
        printJamaMatrix(svd_jama.getU());
        System.out.println("S:");
        System.out.println("pColt:");
        System.out.println(svd_pColt.getS());
        printJamaMatrix(svd_jama.getS());
        System.out.println("V:");
        System.out.println("pColt:");
        System.out.println(svd_pColt.getV());
        printJamaMatrix(svd_jama.getV());

    }

    // Pretty-prints a Jama Matrix, rounding each entry to at most 4 decimal places
    public static void printJamaMatrix(Matrix inp){
        System.out.println("Jama: ");
        System.out.println(String.valueOf(inp.getRowDimension())+" X "+String.valueOf(inp.getColumnDimension()));
        DecimalFormat twoDForm = new DecimalFormat("#.####");
        StringBuffer sb = new StringBuffer();
        for (int r = 0; r < inp.getRowDimension(); r++) {
            for (int c = 0; c < inp.getColumnDimension(); c++)
                sb.append(Double.valueOf(twoDForm.format(inp.get(r, c)))).append("\t");
            sb.append("\n");
        }
        System.out.println(sb.toString());      
    }   
}

Python:

>>> import numpy
>>> numpy_matrix = numpy.array([[2.0, 0.0, 8.0, 6.0, 0.0],
...                             [1.0, 6.0, 0.0, 1.0, 7.0],
...                             [5.0, 0.0, 7.0, 4.0, 0.0],
...                             [7.0, 0.0, 8.0, 5.0, 0.0],
...                             [0.0, 10.0, 0.0, 0.0, 7.0]])
>>> u, s, v = numpy.linalg.svd(numpy_matrix, full_matrices=True)  # note: v here is already V^T
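
For reference, one quick way to sanity-check a result like this is to rebuild the matrix from the three factors. NumPy returns the singular values as a 1-D array and the third return value is already V^T, so the reconstruction needs numpy.diag. A minimal check, assuming the session above has been run:

>>> numpy.allclose(numpy.dot(u, numpy.dot(numpy.diag(s), v)), numpy_matrix)
True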

Is there anything wrong with the code?

Recommended answer

Nothing wrong: the SVD is unique only up to a sign change of the columns of U and V (i.e. if you change the sign of the i-th column of U and of the i-th column of V, you still have a valid SVD: A = U*S*V^T). Different implementations of the SVD will give slightly different results: to check that an SVD is correct, you have to compute norm(A - U*S*V^T) / norm(A) and verify that it is a small number.
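
A minimal NumPy sketch of both points, assuming the matrix A from the question (the indices 3 and 4 below are the two columns where the PColt/NumPy signs differ from the example):

import numpy as np

A = np.array([[2., 0., 8., 6., 0.],
              [1., 6., 0., 1., 7.],
              [5., 0., 7., 4., 0.],
              [7., 0., 8., 5., 0.],
              [0., 10., 0., 0., 7.]])

U, s, Vt = np.linalg.svd(A)   # NumPy returns the singular values as a vector and V^T directly
S = np.diag(s)

# Relative reconstruction error: should be on the order of machine precision
print(np.linalg.norm(A - U.dot(S).dot(Vt)) / np.linalg.norm(A))

# Flipping the sign of the i-th column of U together with the i-th column of V
# (i.e. the i-th row of V^T) leaves U*S*V^T unchanged, so both sign conventions are valid SVDs.
U2, Vt2 = U.copy(), Vt.copy()
for i in (3, 4):
    U2[:, i] *= -1
    Vt2[i, :] *= -1
print(np.allclose(U2.dot(S).dot(Vt2), A))   # prints True

The same check can be run against the Jama and PColt factors; all three should give a comparably tiny relative error, which is why they are all equally valid decompositions.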
