How can I use PCA/SVD in Python for feature selection AND identification?

Question

I'm following "Principal component analysis in Python" to use PCA in Python, but I'm struggling to determine which features to choose (i.e. which of my columns/features have the most variance).

When I use scipy.linalg.svd, it automatically sorts my singular values, so I can't tell which column they belong to.

Sample code:

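import numpy as np
from scipy.linalg import svd

# Four features, six observations each; every feature is constant.
M = [
     [1, 1, 1, 1, 1, 1],
     [3, 3, 3, 3, 3, 3],
     [2, 2, 2, 2, 2, 2],
     [9, 9, 9, 9, 9, 9]
]
# Transpose so that rows are observations and columns are features (6 x 4).
M = np.transpose(np.array(M))
U, s, Vt = svd(M, full_matrices=False)
print(s)  # singular values, always returned in descending order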

Is there a different way to go about this without the singular values being sorted?

Update: It looks like this might not be possible, at least according to this post on the Matlab forums: http://www.mathworks.com/matlabcentral/newsreader/view_thread/241607. If anyone knows otherwise, let me know :)

Recommended answer

I was under the wrong impression that PCA does feature selection, whereas it actually does feature extraction.

PCA creates a new set of features, each of which is a linear combination of the input features. That is also why the sorted singular values cannot be matched to columns: each one belongs to a new component, not to an input column.
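
To make the "combination of the input features" point concrete, here is a minimal sketch using only NumPy. The toy data, the variable names, and the choice to centre the columns before the SVD are assumptions made for illustration; they are not part of the original question or answer.

```python
import numpy as np

# Hypothetical toy data: 6 observations (rows) x 3 features (columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
X[:, 2] = 2.0 * X[:, 0] + 0.1 * X[:, 2]  # make column 2 mostly follow column 0

# PCA via SVD: centre each column, then decompose.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Singular values come back in descending order; each one belongs to a
# component (a row of Vt), not to an input column.
assert np.all(np.diff(s) <= 0)

# Each extracted feature is a weighted sum of all the input columns,
# with the weights given by the matching row of Vt.
scores = Xc @ Vt.T
first_component = sum(Vt[0, j] * Xc[:, j] for j in range(Xc.shape[1]))
assert np.allclose(scores[:, 0], first_component)

print("weights of input columns on component 0:", Vt[0])
```

Row `i` of `Vt` holds the weights of every input column on component `i`, which is the information the question was trying to read off the singular values.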

If you really want to do feature selection with PCA, you can look at the weights of the input features on the features that PCA creates. For instance, the matplotlib.mlab.PCA class exposes those weights as an attribute (more on library):

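# Note: matplotlib.mlab.PCA was removed in Matplotlib 3.1, so this needs an older release.
from matplotlib.mlab import PCA
# data: 2-D NumPy array with one row per observation and one column per feature
res = PCA(data)
print("weights of input vectors: %s" % res.Wt)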
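
Since `matplotlib.mlab.PCA` is not available in current Matplotlib releases, the same idea can be sketched with NumPy alone. The helper `rank_features_by_loading`, its `n_components` parameter, and the scoring rule (absolute weights scaled by each component's share of variance) are hypothetical choices for illustration, not an established API and not the answer author's method.

```python
import numpy as np

def rank_features_by_loading(X, n_components=1):
    """Rank input columns by their absolute weight on the leading components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each column
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Scale each component's weights by the share of variance it explains.
    explained = s**2 / np.sum(s**2)
    importance = np.abs(Vt[:n_components]).T @ explained[:n_components]
    # Column indices, most heavily weighted first.
    return np.argsort(importance)[::-1], importance

# Hypothetical data: column 1 varies a lot, column 0 a little, column 2 not at all.
X = np.array([
    [0.0, 10.0, 5.0],
    [1.0, -10.0, 5.0],
    [0.0, 20.0, 5.0],
    [1.0, -20.0, 5.0],
])
order, importance = rank_features_by_loading(X, n_components=1)
print("columns ranked by weight on the first component:", order)
```

This is only a heuristic: a column with a large weight on a high-variance component contributes a lot to that component, but the ranking depends on how the columns are scaled, so standardising the data first may be worth considering.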

It sounds like the feature extraction route is the way to use PCA, though.
