Fisher's linear discriminant in Python


Problem Description

I have Fisher's linear discriminant and I need to use it to reduce my examples A and B, which are high-dimensional matrices, to a simple 2D representation, exactly like LDA. Each example has classes A and B, so if I had a third example it would also have classes A and B, and a fourth, fifth and n-th example would always have classes A and B as well; therefore I would like to separate them with a simple use of Fisher's linear discriminant. I'm pretty much new to machine learning, so I don't know how to separate my classes; I've been following the formula by eye and coding as I go. From what I've been reading, I need to apply a linear transformation to my data so I can find a good threshold for it, but first I need to find the maximization function. For that task I managed to find Sw and Sb, but I don't know how to go on from there...

This is also where I need to find the maximization function.

That maximization function gives me an eigenvalue solution:
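(For reference, the criterion usually meant here is the Fisher criterion, and setting its gradient to zero gives a generalized eigenvalue problem; S_B and S_W denote the between-class and within-class scatter matrices:)

J(w) = \frac{w^T S_B \, w}{w^T S_W \, w}, \qquad S_W^{-1} S_B \, w = \lambda \, w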

What I have for each class is a 5x2 matrix, for each of 2 examples. For instance:

Example 1
Class_A = [
    201, 103,
    40, 43,
    23, 50,
    12, 123,
    99, 78
]
Class_B = [
    201, 129,
    114, 195,
    180, 90,
    69, 62,
    76, 90
]

Example 2
Class_A = [
    68, 98,
    201, 203,
    78, 212,
    49, 5,
    204, 78
]
Class_B = [
    52, 19,
    220, 219,
    159, 195,
    99, 23,
    46, 50
]
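If it helps, here is a minimal sketch of how to put those numbers into the 5x2 NumPy arrays that the code below assumes (one row per 2-D sample; the variable names simply mirror the ones used further down):

import numpy as np

# each row is one 2-D sample, giving the 5x2 matrices described above
Example_1_Class_A = np.array([[201, 103], [40, 43], [23, 50], [12, 123], [99, 78]])
Example_1_Class_B = np.array([[201, 129], [114, 195], [180, 90], [69, 62], [76, 90]])
Example_2_Class_A = np.array([[68, 98], [201, 203], [78, 212], [49, 5], [204, 78]])
Example_2_Class_B = np.array([[52, 19], [220, 219], [159, 195], [99, 23], [46, 50]])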

I tried finding Sw for the example above like this:

Example_1_Class_A = np.dot(Example_1_Class_A,  np.transpose(Example_1_Class_A))
Example_1_Class_B = np.dot(Example_1_Class_B,  np.transpose(Example_1_Class_B))

Example_2_Class_A = np.dot(Example_2_Class_A,  np.transpose(Example_2_Class_A))
Example_2_Class_B = np.dot(Example_2_Class_B,  np.transpose(Example_2_Class_B))

Sw = np.sum([Example_1_Class_A, Example_1_Class_B, Example_2_Class_A, Example_2_Class_B], axis=0)

As for Sb, I tried it like this:

Example_1_Class_A_mean = Example_1_Class_A.mean(axis=0)
Example_1_Class_B_mean = Example_1_Class_B.mean(axis=0)
         
Example_2_Class_A_mean = Example_2_Class_A.mean(axis=0)
Example_2_Class_B_mean = Example_2_Class_B.mean(axis=0)
         
Example_1_Class_A_Sb = np.dot(Example_1_Class_A_mean, np.transpose(Example_1_Class_A_mean))
Example_1_Class_B_Sb = np.dot(Example_1_Class_B_mean, np.transpose(Example_1_Class_B_mean))
         
Example_2_Class_A_Sb = np.dot(Example_2_Class_A_mean, np.transpose(Example_2_Class_A_mean))
Example_2_Class_B_Sb = np.dot(Example_2_Class_B_mean, np.transpose(Example_2_Class_B_mean))
Sb = np.sum([Example_1_Class_A_Sb, Example_1_Class_B_Sb, Example_2_Class_A_Sb, Example_2_Class_B_Sb], axis=0)

The problem is, I have no idea what else to do with my Sw and Sb; I am completely lost. Basically, what I need to do is get from here to this:

How, for a given Example A and Example B, do I separate a cluster only for the class A points and only for the class B points?

Solution

Before answering your question, I will first touch on the basic difference between PCA and (F)LDA. In PCA you don't know anything about the underlying classes, but you assume that the information about class separability lies in the variance of the data. So you rotate your original axes (sometimes this is called projecting all the data onto new ones) in such a way that your first new axis points in the direction of most variance, the second one is perpendicular to the first and points in the direction of most residual variance, and so on. This way a PCA transformation results in a (sub)space of the same dimensionality as the original one. Then you can take only the first 2 dimensions, rejecting the rest, hence getting a dimensionality reduction from k dimensions to only 2.
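As a small side illustration of that PCA idea (this sketch is not part of the original answer; X is assumed to be an n-by-k data matrix):

import numpy as np

def pca_2d(X):
    # center the data, then diagonalize its covariance matrix
    Xc = X - X.mean(axis=0)
    eig_vals, eig_vecs = np.linalg.eigh(np.cov(Xc.T))  # eigh: the covariance matrix is symmetric
    order = np.argsort(eig_vals)[::-1]                  # sort directions by variance, largest first
    W = eig_vecs[:, order[:2]]                          # keep the 2 highest-variance directions
    return Xc.dot(W)                                    # project onto them: an n-by-2 result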

LDA works a bit differently. In this case you know in advance how many classes there are in your data, and you can find their mean and covariance matrices. What the Fisher criterion does is find a direction in which the distance between the class means is maximized, while at the same time the total variability is minimized (total variability being the mean of the within-class covariance matrices). For any two classes there is only one such line. This is why, when your data has C classes, LDA can provide you with at most C-1 dimensions, regardless of the original data dimensionality. In your case this means that, as you have only 2 classes A and B, you will get a one-dimensional projection, i.e. a line. And this is exactly what you have in your picture: the original 2D data is projected onto a line. The direction of the line is the solution of the eigenproblem. Let's generate data that is similar to your picture:

import numpy as np
import matplotlib.pyplot as plt

a = np.random.multivariate_normal((1.5, 3), [[0.5, 0], [0, .05]], 30)
b = np.random.multivariate_normal((4, 1.5), [[0.5, 0], [0, .05]], 30)
plt.plot(a[:,0], a[:,1], 'b.', b[:,0], b[:,1], 'r.')
mu_a, mu_b = a.mean(axis=0).reshape(-1,1), b.mean(axis=0).reshape(-1,1)
Sw = np.cov(a.T) + np.cov(b.T)
inv_S = np.linalg.inv(Sw)
res = inv_S.dot(mu_a-mu_b)  # the trick
####
# more general solution
#
# Sb = (mu_a-mu_b)*((mu_a-mu_b).T)
# eig_vals, eig_vecs = np.linalg.eig(inv_S.dot(Sb))
# res = eig_vecs[:, np.argmax(eig_vals)].reshape(-1, 1)  # columns of eig_vecs are the eigenvectors; take the one for the largest (and only non-zero) eigenvalue
# res = res / np.linalg.norm(res)

plt.plot([-res[0], res[0]], [-res[1], res[1]]) # this is the solution
plt.plot(mu_a[0], mu_a[1], 'cx')
plt.plot(mu_b[0], mu_b[1], 'yx')
plt.gca().axis('square')

# let's project the data points onto it
r = res.reshape(2,)
n2 = np.linalg.norm(r)**2
for pt in a:
    prj = r * r.dot(pt) / n2
    plt.plot([prj[0], pt[0]], [prj[1], pt[1]], 'b.:', alpha=0.2)
for pt in b:
    prj = r * r.dot(pt) / n2
    plt.plot([prj[0], pt[0]], [prj[1], pt[1]], 'r.:', alpha=0.2)

The resulting projection is calculated using a neat trick for the two-class problem. You can read the details on it here, in section 1.6.
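(In symbols, the line marked "the trick" above computes the standard two-class result

w \propto S_W^{-1} (\mu_a - \mu_b),

which is proportional to the leading eigenvector of S_W^{-1} S_B, so it agrees with the commented-out general solution.)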

Regarding the "examples" you mention in your question: I believe you need to repeat the process for each example, as each is a different set of data points, probably with different distributions. Also, note that the estimated means (mu_a, mu_b) and class covariance matrices will be slightly different from the ones the data was generated with, especially for small sample sizes.
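Since the question also asks about a threshold for separating the two classes: one simple, common choice (a sketch, not from the original answer) is to project both classes onto the direction res found above and put the cut-off halfway between the projected class means:

w = res.reshape(-1)                                   # discriminant direction as a flat vector
proj_a, proj_b = a.dot(w), b.dot(w)                   # 1-D projections of both classes
threshold = (proj_a.mean() + proj_b.mean()) / 2.0     # midpoint between the projected class means

def predict(points):
    # label a point 'A' if it falls on class A's side of the threshold, else 'B'
    side_a = np.sign(proj_a.mean() - threshold)
    return np.where(np.sign(points.dot(w) - threshold) == side_a, 'A', 'B')

print(predict(a))   # should be mostly 'A'
print(predict(b))   # should be mostly 'B'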
