How many principal components to take?


Question

I know that principal component analysis performs an SVD on a matrix and then produces a matrix of eigenvalues. To select the principal components, we keep only the first few eigenvalues. How do we decide how many eigenvalues to take from that matrix?

Recommended answer

To decide how many eigenvalues/eigenvectors to keep, first consider why you are doing PCA. Are you doing it to reduce storage requirements, to reduce dimensionality for a classification algorithm, or for some other reason? If you don't have any strict constraints, I recommend plotting the cumulative sum of the eigenvalues (assuming they are sorted in descending order). If you divide each value by the total sum of the eigenvalues before plotting, the plot shows the fraction of total variance retained versus the number of eigenvalues kept. The plot then gives a good indication of where you hit the point of diminishing returns (i.e., where little variance is gained by retaining additional eigenvalues).
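The cumulative-fraction calculation described above can be sketched in a few lines of plain Python. This is only an illustration, not part of the original answer: the function names, the example eigenvalues, and the 0.95 default threshold are all assumptions made for the example. A fixed threshold is just one common way to turn the "diminishing returns" idea into a number; the answer itself recommends inspecting the curve.

```python
from itertools import accumulate


def cumulative_variance_fraction(eigenvalues):
    """Return the fraction of total variance retained by the top 1, 2, ..., n eigenvalues."""
    # Sort in descending order so the largest components come first.
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    # Running sum divided by the total gives the retained-variance fraction.
    return [running / total for running in accumulate(ordered)]


def components_for_threshold(eigenvalues, threshold=0.95):
    """Smallest k whose cumulative fraction reaches `threshold` (a hypothetical cutoff)."""
    fractions = cumulative_variance_fraction(eigenvalues)
    for k, fraction in enumerate(fractions, start=1):
        if fraction >= threshold:
            return k
    # Fall back to keeping everything, e.g. if rounding keeps the last value just below 1.0.
    return len(fractions)


# Made-up eigenvalues, purely for illustration.
example_eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.5]
fractions = cumulative_variance_fraction(example_eigenvalues)
for k, fraction in enumerate(fractions, start=1):
    print(f"{k} component(s): {fraction:.4f} of total variance")
```

Plotting `fractions` against the component count (for instance with a plotting library of your choice) gives the curve the answer describes. If the eigenvalues come from scikit-learn's `PCA`, its `explained_variance_ratio_` attribute already holds the per-component fractions, so only the running sum is needed.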
