R - how to make PCA biplot more readable

Problem description

I have a set of observations with 23 variables.

When I use prcomp and biplot to plot the results I run into several problems:

  1. the actual plot only occupies half of the frame (x < 0), but the plot is centered on 0, so half of the space is wasted

  2. two variables clearly dominate the results, so all other arrows are clumped together and I can't read a thing

ad 1. I tried setting xlim and/or ylim, but I'm obviously doing something wrong since the plot is all messed up when I do

ad 2. Can I somehow space the arrow labels further apart so that I can read them? Or could I just plot the arrows without the two longest ones (a kind of zoom-in)?

Addendum: is it possible to have biplot draw the labels in a different color than the arrows?

Also: is it problematic if the x and y axes are not proportional (the graph shows intervals of different length on x and y)? I think this would skew the angles between arrows, and that kind of resizing is not a similarity transformation. Is it possible to force biplot to keep a 1:1 aspect ratio, or to draw the plot as a rectangle and not a square?

Recommended answer

I think you can use xlim and ylim. Also, have a look at the expand argument for ?biplot. Unfortunately, you did not provide any data, so let's take some sample data:

a <- princomp(USArrests)

Just calling biplot gives the following result:

biplot(a)

And now one can "zoom in" to have a closer look at "Murder" and "Rape" using xlim and ylim and also use the scaling argument expand from ?biplot:

biplot(a, expand=10, xlim=c(-0.30, 0.0), ylim=c(-0.1, 0.1))

Please note the different scaling on the top and right axis due to the expand factor.
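If a couple of variables dominate only because they are measured on a much larger scale (in USArrests, Assault has by far the largest variance), one thing that may be worth trying is running the PCA on standardized variables. The sketch below uses the cor argument of princomp for that; the name a_cor is just a placeholder, and whether standardizing is appropriate depends on your data.

```r
# PCA on the correlation matrix, i.e. every variable is rescaled to unit variance
a_cor <- princomp(USArrests, cor = TRUE)

# With standardized variables the arrows tend to have comparable lengths
biplot(a_cor)
```

The prcomp equivalent is the scale. argument, e.g. prcomp(USArrests, scale. = TRUE).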

Does this help to make your plot more readable?

Edit

You also asked whether it is possible to have different colors for labels and arrows. biplot does not support this; what you could do is copy the code of stats:::biplot.default and change it according to your needs (change the col argument where plot, axis and text are used).
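If copying biplot.default feels like too much, a rough alternative is to draw a simple biplot by hand with base graphics, which gives full control over the colors. This is only a sketch and not what biplot does internally: the arrow scaling factor (arrow_scale) is an arbitrary choice of mine, and all variable names are placeholders.

```r
pca      <- prcomp(USArrests, scale. = TRUE)
scores   <- pca$x[, 1:2]         # observations on PC1/PC2
loadings <- pca$rotation[, 1:2]  # variable directions on PC1/PC2

# Arbitrary scaling so the arrows stay inside the cloud of scores
arrow_scale <- 0.8 * min(max(abs(scores[, 1])) / max(abs(loadings[, 1])),
                         max(abs(scores[, 2])) / max(abs(loadings[, 2])))
ends <- loadings * arrow_scale

# asp = 1 keeps a 1:1 aspect ratio, so angles between arrows are not distorted
plot(scores, type = "n", asp = 1, xlab = "PC1", ylab = "PC2")
text(scores, labels = rownames(scores), cex = 0.6, col = "grey40")
arrows(0, 0, ends[, 1], ends[, 2], length = 0.08, col = "red")    # arrows in one color
text(ends * 1.1, labels = rownames(ends), col = "blue")           # labels in another
```

To leave out dominant variables, subset the rows of ends before the calls to arrows and text.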

Alternatively, you could use ggplot for the biplot. In the post here, a simple biplot function is implemented. You could change the code as follows:

library(ggplot2)  # also re-exports arrow() and unit() from grid

PCbiplot <- function(PC, x="PC1", y="PC2", colors=c('black', 'black', 'red', 'red')) {
    # PC is a prcomp object
    # colors: observation labels, vertical reference line, variable labels, arrows
    data <- data.frame(obsnames=row.names(PC$x), PC$x)
    plot <- ggplot(data, aes(x=.data[[x]], y=.data[[y]])) + geom_text(alpha=.4, size=3, aes(label=obsnames), color=colors[1])
    # reference lines through the origin (linewidth needs ggplot2 >= 3.4; use size on older versions)
    plot <- plot + geom_hline(yintercept=0, linewidth=.2) + geom_vline(xintercept=0, linewidth=.2, color=colors[2])
    datapc <- data.frame(varnames=rownames(PC$rotation), PC$rotation)
    # stretch the loadings so the arrows span roughly the same range as the scores
    mult <- min(
        (max(data[,y]) - min(data[,y])) / (max(datapc[,y]) - min(datapc[,y])),
        (max(data[,x]) - min(data[,x])) / (max(datapc[,x]) - min(datapc[,x]))
        )
    datapc$v1 <- .7 * mult * datapc[[x]]
    datapc$v2 <- .7 * mult * datapc[[y]]
    plot <- plot + coord_equal() + geom_text(data=datapc, aes(x=v1, y=v2, label=varnames), size = 5, vjust=1, color=colors[3])
    plot <- plot + geom_segment(data=datapc, aes(x=0, y=0, xend=v1, yend=v2), arrow=arrow(length=unit(0.2,"cm")), alpha=0.75, color=colors[4])
    plot
}

Plot it as follows:

fit <- prcomp(USArrests, scale. = TRUE)
PCbiplot(fit, colors=c("black", "black", "red", "yellow"))

If you play around a bit with this function, I am sure you can figure out how to set xlim and ylim values, etc.
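As a starting point for the zooming part, here is a minimal sketch on a bare scores plot (not on PCbiplot itself). It relies on coord_fixed, which accepts xlim and ylim and keeps a 1:1 aspect ratio; the limits below are arbitrary values picked for USArrests.

```r
library(ggplot2)

fit    <- prcomp(USArrests, scale. = TRUE)
scores <- data.frame(obsnames = rownames(fit$x), fit$x)

# coord_fixed() zooms the view without dropping data and keeps x and y on the same scale
p <- ggplot(scores, aes(x = PC1, y = PC2, label = obsnames)) +
    geom_text(size = 3) +
    coord_fixed(xlim = c(-3, 0), ylim = c(-1.5, 1.5))
print(p)
```

Adding the same coord_fixed call to the plot returned by PCbiplot should replace its coord_equal() (ggplot2 prints a message when a coordinate system is replaced).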
