LDA contribution biplot


Problem description


I am trying to create a biplot for a linear discriminant analysis (LDA). I am using a modified version of code obtained from here https://stats.stackexchange.com/questions/82497/can-the-scaling-values-in-a-linear-discriminant-analysis-lda-be-used-to-plot-e

However, I have 80 variables, which makes the biplot extremely difficult to read. Highly contributing variables make it worse: their arrows are very long, so the remaining labels are scrunched up in the middle. What I am trying to achieve is a biplot where all variable arrows have the same length and their relative contributions (scalings) are distinguished by graded colours. So far I have managed to get the graded colours, but I can't find a way to make the arrow lengths the same. From what I understand, geom_text and geom_segment use the LD1 and LD2 values to determine both the length and the direction of the arrows. How can I override the length?

CODE:

library(ggplot2)
library(grid)
library(MASS)
data(iris)


iris.lda <- lda(as.factor(Species)~.,
                data=iris)

#Project data on linear discriminants
iris.lda.values <- predict(iris.lda, iris[,-5])

#Extract scaling for each predictor
data.lda <- data.frame(varnames=rownames(coef(iris.lda)), coef(iris.lda))

#coef(iris.lda) is equivalent to iris.lda$scaling

data.lda$length <- with(data.lda, sqrt(LD1^2+LD2^2))

#Plot the results
p <- qplot(data=data.frame(iris.lda.values$x),
           main="LDA",
           x=LD1,
           y=LD2,
           colour=iris$Species)+stat_ellipse(geom="polygon", alpha=.3, aes(fill=iris$Species))
p <- p + geom_hline(aes(yintercept=0), size=.2) + geom_vline(aes(xintercept=0), size=.2)
p <- p + theme(legend.position="right")
p <- p + geom_text(data=data.lda,
                   aes(x=LD1, y=LD2,
                       label=varnames, 
                       shape=NULL, linetype=NULL,
                       alpha=length, position="identity"),
                   size = 4, vjust=.5,
                   hjust=0, color="red")
p <- p + geom_segment(data=data.lda,
                      aes(x=0, y=0,
                          xend=LD1, yend=LD2,
                          shape=NULL, linetype=NULL,
                          alpha=length),
                      arrow=arrow(length=unit(0.1,"mm")),
                      color="red")
p <- p + coord_flip()

print(p)

Solution

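How about something like this? We have to do some trigonometry to get the lengths to be equal: take the angle of each variable's (LD2, LD1) vector with atan2, then place the arrow tip at a fixed radius along that angle. Note that the lengths are equal in plot coordinates, so if you want the arrows to actually appear equal in size on screen, you'll need to add coord_equal().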

(I cleaned up your plotting code, since a lot of it was quite a mess.)

rad <- 3 # This sets the length of your lines.
data.lda$length <- with(data.lda, sqrt(LD1^2+LD2^2))
data.lda$angle <- atan2(data.lda$LD1, data.lda$LD2)
data.lda$x_start <- data.lda$y_start <- 0
data.lda$x_end <- cos(data.lda$angle) * rad
data.lda$y_end <- sin(data.lda$angle) * rad

#Plot the results
ggplot(cbind(iris, iris.lda.values$x),
       aes(y = LD1, x = LD2, colour = Species)) + 
  stat_ellipse(aes(fill = Species), geom = "polygon", alpha = .3) +
  geom_point() +
  geom_hline(yintercept = 0, size = .2) + 
  geom_vline(xintercept = 0, size = .2) +
  geom_text(aes(y = y_end, x = x_end, label = varnames, alpha = length),
            data.lda, size = 4, vjust = .5, hjust = 0, colour = "red") +
  geom_spoke(aes(x_start, y_start, angle = angle, alpha = length), data.lda, 
             color = "red", radius = rad, size = 1) +
  ggtitle("LDA") +
  theme(legend.position = "right")
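
The same equal-length endpoints can also be computed without trigonometry, by dividing each scaling vector by its own length and multiplying by the desired radius. The block below is only a sketch of that alternative, not the method used in the answer above: the names `scores` and `arrows.df` and the value `rad <- 3` are assumptions made for illustration, and it swaps geom_spoke for geom_segment so that an arrowhead can be drawn. It assumes no variable has a scaling of exactly zero on both discriminants, since that would divide by zero.

```r
library(MASS)  # provides lda()

# Fit the same model as in the question and project the data.
iris.lda <- lda(Species ~ ., data = iris)
scores <- data.frame(predict(iris.lda, iris[, -5])$x, Species = iris$Species)

# One row per predictor with its LD1/LD2 scaling coefficients.
arrows.df <- data.frame(varnames = rownames(iris.lda$scaling),
                        iris.lda$scaling[, c("LD1", "LD2")])

rad <- 3  # assumed common arrow length, in plot units
arrows.df$length <- sqrt(arrows.df$LD1^2 + arrows.df$LD2^2)

# Unit vector times rad. The plot puts LD2 on x and LD1 on y,
# matching the axis layout of the answer above.
arrows.df$x_end <- arrows.df$LD2 / arrows.df$length * rad
arrows.df$y_end <- arrows.df$LD1 / arrows.df$length * rad

# Sanity check: every arrow now has length rad.
stopifnot(isTRUE(all.equal(sqrt(arrows.df$x_end^2 + arrows.df$y_end^2),
                           rep(rad, nrow(arrows.df)))))

# Plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(scores, aes(x = LD2, y = LD1)) +
    geom_point(aes(colour = Species)) +
    geom_segment(aes(x = 0, y = 0, xend = x_end, yend = y_end, alpha = length),
                 data = arrows.df, colour = "red",
                 arrow = grid::arrow(length = grid::unit(2, "mm"))) +
    geom_text(aes(x = x_end, y = y_end, label = varnames, alpha = length),
              data = arrows.df, colour = "red", hjust = 0) +
    coord_equal()
  print(p)
}
```

The original magnitude is kept in the `length` column and mapped to alpha, so the relative contribution of each variable is still visible even though every arrow is drawn at the same size.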
