How to find most contributing features to PCA?


Question

I am running PCA on my data (~250 features) and see that all points are clustered in 3 blobs.

Is it possible to see which of the 250 features contribute most to the outcome? If so, how?

(Using the scikit-learn implementation.)

Solution

Let's see what Wikipedia says:

PCA is mathematically defined as an orthogonal linear transformation that transforms the data to a new coordinate system such that the greatest variance by some projection of the data comes to lie on the first coordinate (called the first principal component), the second greatest variance on the second coordinate, and so on.
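To make that definition concrete, here is a minimal sketch on synthetic data (the data, the shapes and the variable names are assumptions for illustration, not part of the question). It checks that the fitted directions are orthonormal and that the variance along the new coordinates decreases from the first component onward.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic data (assumed): 200 samples, 5 features with very different scales.
X = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])

pca = PCA(n_components=3).fit(X)
Z = pca.transform(X)  # data expressed in the new coordinate system

# Sample variance of the data along each new coordinate.
projected_variance = Z.var(axis=0, ddof=1)

# Orthogonal transformation: the rows of components_ are orthonormal,
# so this product should be (numerically) the 3 x 3 identity.
gram = pca.components_ @ pca.components_.T

# Greatest variance on the first coordinate, second greatest on the second, ...
is_sorted = bool(np.all(np.diff(projected_variance) <= 0))
```

The values in projected_variance should match pca.explained_variance_, which is what scikit-learn reports for each component.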

To see how influential each original feature is in the smaller space, you have to project the features as well. This can be done with:

res = pca.transform(np.eye(D))

  • np.eye(n) creates an n x n identity matrix (ones on the diagonal, zeros elsewhere).
  • Thus, each row of np.eye(D) is one of your features, written as a unit vector in the original feature space.
  • res is the projection of those unit vectors into the lower-dimensional space.
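As a rough sketch of what that projection returns (synthetic data; D, d and the variable names are assumptions for illustration): pca.transform subtracts the fitted mean before projecting, so res equals pca.components_.T shifted by one constant per component. If the training data is already centered, the two coincide.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
D, d = 6, 2                                # assumed sizes, for illustration only
X = rng.normal(size=(100, D)) + 10.0       # deliberately non-zero mean

pca = PCA(n_components=d).fit(X)

# Project the D unit vectors, as in the answer.
res = pca.transform(np.eye(D))             # shape (D, d)

# The loadings stored by scikit-learn: one column per component.
loadings = pca.components_.T               # shape (D, d)

# transform() centers its input first, which shifts every row of res
# by the same per-component constant.
offset = pca.mean_ @ pca.components_.T     # shape (d,)
```

Here res should equal loadings - offset, so reading pca.components_.T directly gives the same per-feature weights without the shift.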

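The interesting thing is that res is a D x d matrix in which res[i][j] represents how much feature i contributes to component j. Note that transform also subtracts the fitted mean, so res matches pca.components_.T exactly only when the training data is centered; otherwise every row is shifted by the same constant per component.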

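Then you can sum across the d columns of each row to get a D x 1 vector (call it contribution), where contribution[i] is the total contribution of feature i. Entries of res can be negative, so summing absolute values keeps opposite signs from cancelling out.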

Sort it and you will find the most contributing features :)
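Here is one way the summing and sorting could look, as a sketch rather than a definitive recipe. Two things differ from the steps above and are my assumptions: it reads the weights from pca.components_.T instead of calling transform (to avoid the mean shift), and it sums absolute values so that positive and negative weights do not cancel. The synthetic data and the names contribution and ranking are also made up for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n, D, d = 300, 8, 2                 # assumed sizes, for illustration only
X = rng.normal(size=(n, D))
X[:, 3] *= 10.0                     # give feature 3 a large variance
X[:, 5] *= 6.0                      # and feature 5 a somewhat smaller one

pca = PCA(n_components=d).fit(X)

# D x d matrix: weight of feature i in component j.
res = pca.components_.T

# Total contribution of each feature across the d kept components.
contribution = np.abs(res).sum(axis=1)      # shape (D,)

# Feature indices, most contributing first.
ranking = np.argsort(contribution)[::-1]
```

On this toy data the two inflated features should come out on top. On real data you would usually standardize the features first, otherwise the ranking mostly reflects their scales. Weighting each column by pca.explained_variance_ratio_ before summing is another option if the first components should count for more.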

Not sure it's clear; I can add more information if needed.

Hope this helps, pltrdy
