通过三种方法计算CCA [英] Computing CCA through three approaches

查看:224
本文介绍了通过三种方法计算CCA的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我最近研究了CCA的概念,并想在MATLAB中实现它.但是,存在现有的matlab命令 canoncorr .我想编写自己的代码.我已经对其进行了广泛的研究,发现了三种方法:

I have recently studied the concepts of CCA and wanted to implement it in MATLAB. However there is an existing matlab command canoncorr present. I wanted to write my own code. I have studied it extensively and found three approaches :

1:Hardoon:该方法使用拉格朗日乘数将问题分解为广义特征值问题.可以在这里找到代码: cca_hardoon 为了保持理智,我还在此处提供代码:数据必须事先居中.

1: Hardoon : The approach uses lagrange multipliers to decompose the problem into an generalised eigenvalue problem. The code can be found here : cca_hardoon For sanity sake I am also giving the code here : The data has to be centered previously.

function [Wx, Wy, r] = cca(X,Y)

% CCA calculate canonical correlations
%
% [Wx Wy r] = cca(X,Y) where Wx and Wy contains the canonical correlation
% vectors as columns and r is a vector with corresponding canonical
% correlations.
%
% Update 31/01/05 added bug handling.

if (nargin ~= 2)
  disp('Inocorrect number of inputs');
  help cca;
  Wx = 0; Wy = 0; r = 0;
  return;
end


% calculating the covariance matrices
z = [X; Y];
C = cov(z.');
sx = size(X,1);
sy = size(Y,1);
Cxx = C(1:sx, 1:sx) + 10^(-8)*eye(sx);
Cxy = C(1:sx, sx+1:sx+sy);
Cyx = Cxy';
Cyy = C(sx+1:sx+sy,sx+1:sx+sy) + 10^(-8)*eye(sy);

%calculating the Wx cca matrix
Rx = chol(Cxx);
invRx = inv(Rx);
Z = invRx'*Cxy*(Cyy\Cyx)*invRx;
Z = 0.5*(Z' + Z);  % making sure that Z is a symmetric matrix
[Wx,r] = eig(Z);   % basis in h (X)
r = sqrt(real(r)); % as the original r we get is lamda^2
Wx = invRx * Wx;   % actual Wx values

% calculating Wy
Wy = (Cyy\Cyx) * Wx; 

% by dividing it by lamda
Wy = Wy./repmat(diag(r)',sy,1);

2. MATLAB方法.请注意,数据居中是在代码本身内完成的.

2. MATLAB approach Please note the centering of data is done within the code itself.

3.仅通过普通SVD进行的CCA::这种方法不需要进行qr分解,而仅使用svd分解.我在这里提到这篇文章的顶部: svd的cca .请参阅以下文本文章,这些文章摘自引用的文章.

3. CCA by Normal SVD only : This approach does not require the qr decomposition and utilizes the svd decomposition only. I have referred top this article here : cca by svd. Please refer to the text articles below which are taken from the referred article.

我尝试自己编写此程序,但未成功.

I have tried to code this program myself but unsuccessfully.

function [A,B,r,U,V] = cca_by_svd(x,y)
% computing the means
N = size(x,1); mu_x = mean(x,1); mu_y = mean(y,1);
% substracting the means
x = x - repmat(mu_x,N,1); y = y - repmat(mu_y,N,1);

x = x.'; y = y.';
% computing the covariance matrices
Cxx = (1/N)*x*(x.');  Cyy = (1/N)*y*(y.'); Cxy = (1/N)*x*(y.');

%dimension
m = min(rank(x),rank(y));
%m = min(size(x,1),size(y,1));

% computing the quare root inverse of the matrix
[V,D]=eig(Cxx); d = diag(D);
% Making all the eigen values positive
d = (d+abs(d))/2; d2 = 1./sqrt(d); d2(d==0)=0; Cxx_iv=V*diag(d2)*inv(V);

% computing the quare root inverse of the matrix
[V,D]=eig(Cyy); d = diag(D);
% Making all the eigen values positive
d = (d+abs(d))/2; d2 = 1./sqrt(d); d2(d==0)=0; Cyy_iv=V*diag(d2)*inv(V);

Omega = Cxx_iv*Cxy*Cyy_iv;
[C,Sigma,D] = svd(Omega);
A = Cxx_iv*C; A = A(:,1:m);
B = Cyy_iv*D.'; B = B(:,1:m);
A = real(A); B = real(B);
U = A.'*x; V = B.'*y;
r = Sigma(1:m,1:m);

我正在运行以下代码段:

I am running this code snippet:

clc;clear all;close all;
load carbig;
X = [Displacement Horsepower Weight Acceleration MPG];
nans = sum(isnan(X),2) > 0;
x = X(~nans,1:3);
y = X(~nans,4:5);
[A1, B1, r1, U1, V1] = canoncorr(x, y);

[A2, B2, r2, U2, V2] = cca_by_svd(x, y);

[A3, B3, r3] = cca(x.',y.',1);

投影向量就是这样:

>> A1
A1 =
    0.0025    0.0048
    0.0202    0.0409
   -0.0000   -0.0027
>> A2
A2 =
    0.0025    0.0048
    0.0202    0.0410
   -0.0000   -0.0027
>> A3
A3 =
   -0.0302   -0.0050   -0.0022
    0.0385   -0.0420   -0.0176
    0.0020    0.0027   -0.0001
>> B1
B1 =
   -0.1666   -0.3637
   -0.0916    0.1078
>> B2
B2 =
   -0.1668   -0.3642
   -0.0917    0.1079
>> B3
B3 =
   0.0000 + 0.0000i   0.3460 + 0.0000i   0.1336 + 0.0000i
   0.0000 + 0.0000i  -0.0967 + 0.0000i   0.0989 + 0.0000i

问题:有人可以告诉我我要去哪里了.我所提到的三种方法都解决了相同的问题,理想情况下,它们的解决方案应该融合在一起.我承认我的代码'cca_by_svd'可能是错误的,但是hardoon的代码和matlab的输出应该相同.请向我指出我要去哪里了. 修改,我已重新检查并更正了我的代码.现在,对于此数据集,方法2和3收敛.

Question: Can someone please tell me where I am going wrong. The three approaches that I have referred all solve the same problem and ideally their solutions should converge. I admit my code 'cca_by_svd' may be wrong but hardoon's code and matlab's output should be same. Please point out to me where I am going wrong. edit I have rechecked and corrected my code. Now for this dataset the method 2 and 3 converge.

推荐答案

在某些方面,cca(X,Y)不能做到canoncorr可以做到:

There's a few things that cca(X,Y) doesn't do that canoncorr does:

一个正在规范化数据.如果将X = normc(X')'(也用于Y)添加到cca(X,Y)函数,则输出r将与canoncorr的输出匹配.如果查看canoncorr的代码,您会发现它是通过XY的QR分解开始的.

One is normalizing the data. If you add X = normc(X')' (also for Y) to your cca(X,Y) function, the output r will match that of canoncorr. If you look into canoncorr's code, you'll see that it starts by QR decomposition of X and Y.

另一个区别是eig以升序对特征值进行排序,因此cca(X,Y)应该翻转eig(Z)的输出.

Another difference is that eig sorts the eigenvalues in ascending order, so cca(X,Y) should flip the output of eig(Z).

注意:尽管纠正了这些差异,但我无法完全恢复Wx和Wy来匹配canoncorr的输出.理想情况下,在ccacanoncorr之间Wx'* Wx应该看起来完全一样.

NOTE: Despite correcting these differences, I wasn't able to fully recover Wx and Wy to match the outputs of canoncorr. Ideally, Wx'*Wx should look exactly alike between cca and canoncorr.

这篇关于通过三种方法计算CCA的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆