How to run and interpret Fisher's Linear Discriminant Analysis from scikit-learn

Question

I am trying to run Fisher's LDA (1, 2) to reduce the number of features of a matrix.

Basically, and correct me if I am wrong: given n samples classified into several classes, Fisher's LDA tries to find an axis such that projecting the samples onto it maximizes the value J(w), which is the ratio of the total sample variance to the sum of the variances within the separate classes.
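For reference, the criterion is more commonly written with the between-class scatter in the numerator. The sketch below uses the symbols S_B, S_W and S_T (between-class, within-class and total scatter matrices), which are notation assumed here rather than taken from the question. Because the total scatter splits into the other two, the "total over within" ratio described above differs from the usual J(w) only by a constant, so both are maximized by the same axis w.

```latex
J(\mathbf{w}) = \frac{\mathbf{w}^{\top} S_B \,\mathbf{w}}{\mathbf{w}^{\top} S_W \,\mathbf{w}},
\qquad
S_T = S_B + S_W
\;\Longrightarrow\;
\frac{\mathbf{w}^{\top} S_T \,\mathbf{w}}{\mathbf{w}^{\top} S_W \,\mathbf{w}} = 1 + J(\mathbf{w})
```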

I think this can be used to find the most useful features for each class.

I have a matrix X of m features and n samples (m rows, n columns).

I have a sample classification y, i.e. an array of n labels, one for each sample.
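One detail worth checking with this layout: scikit-learn estimators expect data shaped (n_samples, n_features), so a matrix stored as m features by n samples would need to be transposed before fitting. A minimal sketch, where the small array and the name X_features_by_samples are made up for illustration:

```python
import numpy as np

# Hypothetical data laid out as described above:
# m = 2 features (rows), n = 6 samples (columns).
X_features_by_samples = np.array([
    [-1, -2, -3, 1, 2, 3],  # feature 0 across the six samples
    [-1, -1, -2, 1, 1, 2],  # feature 1 across the six samples
])
y = np.array([1, 1, 1, 2, 2, 2])  # one label per sample

# scikit-learn expects shape (n_samples, n_features), so transpose.
X = X_features_by_samples.T

print(X.shape)  # (6, 2)
```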

Based on y, I want to reduce the number of features to, for example, the 3 most representative features.

Using scikit-learn I tried in this way (following this documentation):

>>> import numpy as np
>>> from sklearn.lda import LDA
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> clf = LDA(n_components=3)
>>> clf.fit_transform(X, y)
array([[ 4.],
       [ 4.],
       [ 8.],
       [-4.],
       [-4.],
       [-8.]])

At this point I am a bit confused: how do I obtain the most representative features?

Recommended answer

The features you are looking for are in clf.coef_ after you have fitted the classifier.

Note that n_components=3 doesn't make sense here, since X.shape[1] == 2, i.e. your feature space only has two dimensions.
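To make the limit concrete: as far as I know, the LDA projection can have at most min(n_features, n_classes - 1) dimensions, which for this two-class data is 1. That would also explain why the output in the question has a single column. The sketch below assumes a recent scikit-learn release, where the class is imported from sklearn.discriminant_analysis instead of sklearn.lda; the variable names are illustrative.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
y = np.array([1, 1, 1, 2, 2, 2])

# Upper bound on the number of discriminant axes.
n_features = X.shape[1]
n_classes = np.unique(y).size
max_components = min(n_features, n_classes - 1)  # 1 for this data

lda = LinearDiscriminantAnalysis(n_components=max_components)
X_projected = lda.fit_transform(X, y)

print(max_components)     # 1
print(X_projected.shape)  # (6, 1)
```

With three or more classes and enough features, the same computation would allow more than one component.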

You do not need to invoke fit_transform in order to obtain coef_; calling clf.fit(X, y) will suffice.
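A minimal sketch of that workflow, again assuming a recent scikit-learn release (import from sklearn.discriminant_analysis). Ranking features by the absolute size of their coefficients is a heuristic added here, not something the answer prescribes, and it is only meaningful when the features are on comparable scales.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
y = np.array([1, 1, 1, 2, 2, 2])

clf = LinearDiscriminantAnalysis()
clf.fit(X, y)  # fit alone is enough to populate coef_

weights = clf.coef_   # one weight per feature
print(weights.shape)  # (1, 2) for a two-class problem

# Heuristic (an assumption, not part of the answer): order features
# by the magnitude of their weights, largest first.
importance = np.abs(weights).sum(axis=0)
ranking = np.argsort(importance)[::-1]
print(ranking)
```

Here ranking holds feature indices, so ranking[:k] would pick the k features with the largest weights.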
