How to use eigenvectors obtained through PCA to reproject my data?


Problem description

I am using PCA on 100 images. My training data is 442368x100 double matrix. 442368 are features and 100 is number of images. Here is my code for finding the eigenvector.

[rows, cols] = size(training);                   % 442368 features x 100 images
maxVec = rows;
maxVec = min(maxVec, rows);
train_mean = mean(training, 2);                  % per-feature mean
A = training - train_mean*ones(1, cols);         % mean-subtract every image
A = A'*A;                                        % 100 x 100 matrix
[evec, eval] = eig(A);
[eval, ind] = sort(-1*diag(eval));               % sort eigenvalues in descending order
evec = evec(:, ind(1:100));                      % keep the top 100 eigenvectors

Now evec is a 100x100 double matrix of eigenvectors, so I have 100 sorted eigenvectors.

Question:

Now, if I want to transform my testing data using above calculated eigenvectors then how can I use these eigenvectors? My testing data is 442368x50 double but my eigenvector matrix is 100x100 double. The inner matrix dimensions don't agree. How can I find the dot product of my testing data and eigenvector matrix?

Answer

What you are doing is essentially dimensionality reduction. You currently have the top 100 eigenvectors that determine the basis vectors retaining the largest variance in your data. What you want to do now is project your test data onto these same basis vectors. BTW, you do have an error in your covariance matrix calculation: it should be performed on a per-feature basis, but you are performing it on a per-image basis, so that's not correct. You have to swap the order of the transpose in your calculation. You also must divide by the total number of examples minus 1 to complete the calculation and produce an unbiased estimator:

A = (1/(cols-1))*(A*A.');

Transposing A first and then multiplying assumes that each column is a feature, but that is not the case for you. If you recall from dimensionality reduction, we currently have a matrix of eigenvectors where each column is an eigenvector. If you want to finally perform the reduction, it is simply a multiplication of the mean-subtracted data matrix with the eigenvector matrix. It is important to note that the eigenvectors in this matrix are ordered so that the basis vector encompassing the largest variance explained by your data appears first; that is why the sorting is performed on the eigenvalues, as the eigenvector with the largest eigenvalue embodies this property. However, this operation assumes that each column is a feature, whereas in your data matrix each row is a feature. If you wanted to perform the reconstruction on your original training data, you would need to transpose the mean-subtracted data before doing this multiplication, but that would put each example in a row. Since in your code each column is an example, you can transpose the eigenvector matrix instead:

% Assuming you did (1/(cols-1))*(A*A.') to compute the eigenvectors
Atrain = training - train_mean*ones(1, cols);
Areconstruct = evec.' * Atrain;

Areconstruct will contain the reconstructed data, where each column is the corresponding reprojected example. I also needed to rebuild the mean-subtracted feature matrix, as your code overwrites it with the covariance matrix. If you want to perform this reprojection on your test data, you must subtract the mean computed from your training data and then apply the multiplication above. Assuming your data is stored in test_data, simply do the following:

cols_test = size(test_data, 2);
B = test_data - train_mean*ones(1, cols_test);
Breconstruct = evec.' * B;

Breconstruct contains the test data reprojected onto the basis vectors; it will now be a 100 x 50 matrix where each column is a reprojected example from your test data.
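
As a side note (not part of the original answer), you can map these reprojected coefficients back to the original feature space to see how well the 100 basis vectors approximate your test images. A minimal sketch, reusing the variable names from the snippets above:

% Sketch (my addition, not the original answer's code): approximate the original
% test vectors from their 100 projection coefficients.
B_approx = evec * Breconstruct + train_mean*ones(1, cols_test);  % back-project: 442368 x 50
recon_error = norm(test_data - B_approx, 'fro');                 % overall approximation error

The smaller recon_error is, the more of the test data's variation the retained basis vectors capture.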

This code will probably run very slowly, or in the worst case not run at all, because your covariance matrix is quite large. It is highly advisable that you reduce the total number of features a priori as much as possible before attempting dimensionality reduction. As you have stated in your comments, each example is simply an unrolled version of an image as a long vector, so try resizing the images down to something manageable. In addition, it is customary to low-pass filter the image (with a Gaussian blur, for example) before downsampling, as this prevents aliasing.
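
A minimal preprocessing sketch of that idea (my own illustration, not code from the original answer), assuming the Image Processing Toolbox and a hypothetical cell array imgs of grayscale images:

% Blur, shrink and unroll each image before building the training matrix.
newSize   = [64 48];                      % hypothetical target size -- pick something manageable
numImages = numel(imgs);
training  = zeros(prod(newSize), numImages);
for k = 1:numImages
    I = im2double(imgs{k});
    I = imgaussfilt(I, 1);                % low-pass filter before downsampling
    I = imresize(I, newSize);             % shrink to reduce the number of features
    training(:, k) = I(:);                % unroll into a long feature vector
end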

Also, check out the recommendation I have for using Singular Value Decomposition later on in this post. It should be faster than using the eigenvectors of the covariance matrix.

I would improve on this code by using bsxfun, and you can also use sort with the 'descend' flag so you don't have to multiply your values by -1 before sorting to get the indices in descending order. bsxfun lets you mean-subtract your features efficiently, without replicating the mean of each feature for every example you have (i.e. without the ones(1, cols) trick).

Specifically:

[rows, cols] = size(training);
maxVec = rows;
maxVec = min(maxVec, rows);
train_mean = mean(training, 2);
A = bsxfun(@minus, training, train_mean); % Change
%A = training - train_mean*ones(1,cols);
Acov = (1/(cols-1))*(A*A.'); % Change - correct formula
[evec, eval] = eig(Acov);
%[eval ind] = sort(-1*diag(eval));
[eval, ind] = sort(diag(eval), 'descend'); % Change
evec = evec(:, ind(1:100));

Finally, for your test data:

B = bsxfun(@minus, test_data, train_mean);
Breconstruct = evec.' * B;


A word of advice - Use SVD

Using the eigenvectors for dimensionality reduction is known to be unstable - specifically when it comes to computing eigenvectors for high dimensional data such as what you have. It is advised that you use the Singular Value Decomposition (SVD) framework to do so instead. You can view this Cross Validated post on the relationship between the eigenvectors of the covariance matrix and using SVD for performing PCA:

https://stats.stackexchange.com/questions/134282/relationship-between-svd-and-pca-how-to-use-svd-to-perform-pca

As such, compute the SVD on the covariance matrix and the columns of V are the eigenvectors you need to perform the computation. The added benefit with SVD is the eigenvectors are already ordered based on their variance and so the first column of V would be the basis vector that points in the direction with the largest variance. As such, you don't need to do any sorting as you did with the eigenvectors.

Therefore, you would use this with the SVD:

Acov = (1/(cols-1))*(A*A.'); 
[U,S,V] = svd(Acov);
Areconstruct = V(:, 1:100).' * A;
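
Since the singular values in S are already sorted, you can also check how much variance the 100 retained basis vectors account for. A small optional sketch (my addition, using the S returned by the decomposition above):

explained = cumsum(diag(S)) / sum(diag(S));    % cumulative fraction of variance per component
fprintf('Top 100 components retain %.2f%% of the variance\n', 100*explained(100));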

For your test data:

B = bsxfun(@minus, test_data, train_mean);
Breconstruct = V(:, 1:100).' * B;

Further reading

You can have a look at my post on dimensionality reduction, using the eigenvectors and eigenvalues of the covariance matrix, in my answer here: What does selecting the largest eigenvalues and eigenvectors in the covariance matrix mean in data analysis?

It also gives you a brief overview of why this operation is done to perform PCA or dimensionality reduction. However, I would highly advise you use SVD to do what you need. It is faster and more stable than using the eigenvectors of the covariance matrix.
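
In the same spirit (this is my own sketch, not code from the original answer), you can avoid forming the huge 442368 x 442368 covariance matrix altogether by taking an economy-size SVD of the mean-subtracted data matrix A itself; its left singular vectors are the same basis vectors, already sorted by variance:

% Sketch (assumption): PCA via an economy SVD of A (features x examples),
% so the covariance matrix is never built explicitly.
[U, S, ~] = svd(A, 'econ');            % U is 442368 x 100, columns sorted by variance
k = min(100, size(U, 2));              % number of basis vectors to keep
Areconstruct = U(:, 1:k).' * A;        % project the training data: k x 100
Breconstruct = U(:, 1:k).' * B;        % project the test data:     k x 50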
