PCA with sklearn. Unable to figure out feature selection with PCA

Question

I have been trying to do some dimensionality reduction using PCA. I currently have an image of size (100, 100), and I am using a filter bank of 140 Gabor filters, where each filter gives a response that is again a (100, 100) image. I want to do feature selection, keeping only non-redundant features, and I read that PCA might be a good way to do this.

So I created a data matrix with 10000 rows and 140 columns, where each row contains the responses of the 140 Gabor filters in the filter bank at one pixel. As I understand it, I can decompose this matrix using PCA as follows:

from sklearn.decomposition import PCA

pca = PCA(n_components=3)
pca.fit(Q) # Q is my 10000 X 140 matrix
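
For reference, a minimal sketch of how a 10000 x 140 matrix like Q could be assembled from the filter responses. The array name `responses` and the random stand-in data are assumptions for illustration only; the question does not show how the Gabor responses are actually stored.

```python
import numpy as np

# Hypothetical stand-in for the 140 Gabor responses, each a (100, 100) image.
# Real filter outputs would replace this random data.
rng = np.random.default_rng(0)
responses = rng.normal(size=(140, 100, 100))

# Flatten each response to 10000 pixels, then transpose so that
# rows are pixels (samples) and columns are filters (features).
Q = responses.reshape(140, -1).T

print(Q.shape)  # (10000, 140)

# Row r * 100 + c holds the 140 filter responses at pixel (r, c).
pixel_row, pixel_col = 5, 7
features_at_pixel = Q[pixel_row * 100 + pixel_col]
```

With this layout, scikit-learn treats each pixel as one sample and each filter as one feature, which matches the description in the question.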

However, I am now confused about how to figure out which of these 140 feature vectors to keep. I am guessing it should give me 3 of these 140 vectors (corresponding to the Gabor filters that contain the most information about the image), but I have no idea how to proceed from here.

Recommended answer

PCA will give you linear combinations of features, not a selection of features. It gives you the linear combinations that are best for reconstruction in the L2 sense, that is, the ones that capture the most variance.
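
To make the distinction concrete, here is a minimal sketch using random stand-in data in place of the real Gabor responses (the data and the explicit `svd_solver="full"` setting are assumptions for illustration). It shows that each of the 3 components is a weight vector over all 140 original features, rather than a pick of 3 of them.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for the 10000 x 140 matrix of Gabor responses (assumption).
rng = np.random.default_rng(0)
Q = rng.normal(size=(10000, 140))

pca = PCA(n_components=3, svd_solver="full")
pca.fit(Q)
Z = pca.transform(Q)

# Each component is a row of 140 weights, one per original feature.
print(pca.components_.shape)  # (3, 140)
print(Z.shape)                # (10000, 3)

# Fraction of the total variance captured by each component.
print(pca.explained_variance_ratio_)

# The projection is the centered data times the weight vectors,
# so every output column mixes all 140 filters.
Z_manual = (Q - pca.mean_) @ pca.components_.T
print(np.allclose(Z, Z_manual))
```

The columns of `Z` are new features built from all 140 filter responses, so PCA alone does not tell you which 3 filters to keep.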

What is your goal? If you do this on a single image, any kind of selection will give you the features that best discriminate some parts of the image from other parts of the same image.

Also: Gabor filters are a sparse basis for natural images. I would not expect anything interesting to happen unless you have very specific images.
