How to plot parallel coordinates with multiple categorical variables in R
Problem description
I am having difficulty drawing a parallel coordinates plot with ggparcoord from the GGally package. As there are two categorical variables, what I want to show in the visualisation is like the image below. I've found that in ggparcoord, groupColumn accepts only a single variable to group (colour) by. I can use showPoints to mark the values on the axes, but I also need to vary the shape of these markers according to the categorical variables. Is there another package that can help me realise my idea?

Any response will be appreciated! Thanks!
Solution

It's not that difficult to roll your own parallel coordinates plot in ggplot2, which gives you the flexibility to customize the aesthetics. Below is an illustration using the built-in diamonds data frame.

To get parallel coordinates, you need to add an ID column so you can identify each row of the data frame, which we'll use as the group aesthetic in ggplot. You also need to scale the numeric values so that they'll all be on the same vertical scale when we plot them. Then you need to take all the columns that you want on the x-axis and reshape them to "long" format. We do all that on the fly below with the tidyverse/dplyr pipe operator.

Even after limiting the number of category combinations, the lines are probably too intertwined for this plot to be easily interpretable, so consider this merely a "proof of concept". Hopefully, you can create something more useful with your data. I've used the colour (for the lines) and fill (for the points) aesthetics below. You can use shape or linetype instead, depending on your needs.

    library(tidyverse)
    theme_set(theme_classic())

    # Get 20 random rows from the diamonds data frame after limiting
    # to two levels each of cut and color
    set.seed(2)
    ds = diamonds %>%
      filter(color %in% c("D","J"), cut %in% c("Good", "Premium")) %>%
      sample_n(20)

    ggplot(ds %>%
             mutate(ID = 1:n()) %>%             # Add ID for each row
             mutate_if(is.numeric, scale) %>%   # Scale numeric columns
             gather(key, value, c(1,5:10)),     # Reshape to "long" format
           aes(key, value, group=ID, colour=color, fill=cut)) +
      geom_line() +
      geom_point(size=2, shape=21, colour="grey50") +
      scale_fill_manual(values=c("black","white"))
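Since the question asks specifically for marker shapes that vary with a categorical variable, here is a minimal sketch of the shape variant mentioned above. It is one possible arrangement, not the answer's own code: the names num_cols, ds_long and p are illustrative, the shape codes 16 and 17 are arbitrary choices, and pivot_longer() and across() are swapped in for gather() and mutate_if() on the assumption that recent tidyr and dplyr versions are installed.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

# Same subset as in the answer: two levels each of cut and color, 20 rows
set.seed(2)
ds <- diamonds %>%
  filter(color %in% c("D", "J"), cut %in% c("Good", "Premium")) %>%
  droplevels() %>%
  sample_n(20)

# Numeric columns to place on the x-axis (illustrative helper name)
num_cols <- c("carat", "depth", "table", "price", "x", "y", "z")

ds_long <- ds %>%
  mutate(ID = row_number()) %>%                                  # one ID per row
  mutate(across(all_of(num_cols), ~ as.numeric(scale(.x)))) %>%  # standardise each axis
  pivot_longer(all_of(num_cols), names_to = "key", values_to = "value")

# Line colour follows `color`; point shape follows `cut`
p <- ggplot(ds_long, aes(key, value, group = ID)) +
  geom_line(aes(colour = color)) +
  geom_point(aes(shape = cut), size = 2.5) +
  scale_shape_manual(values = c(Good = 16, Premium = 17))

print(p)
```

Mapping linetype = cut inside geom_line() works the same way if line style suits the data better than marker shape.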
I haven't used ggparcoord before, but the only option that seemed straightforward (at least on my first try with the function) was to paste together two columns of data. Below is an example. Even with just four category combinations, the plot is confusing, but maybe it will be interpretable if there are strong patterns in your data:

    library(GGally)

    # Combine the two categorical variables into a single grouping column
    ds$group = with(ds, paste(cut, color, sep="-"))

    ggparcoord(ds, columns=c(1, 5:10), groupColumn=11) +
      theme(panel.grid.major.x=element_line(colour="grey70"))
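The pasted grouping column can also be built with interaction(), and the columns can be passed to ggparcoord by name so the call does not depend on column positions. This is a hedged sketch rather than the answer's method: it assumes GGally is installed (the plotting call is skipped otherwise), and the name num_cols is illustrative.

```r
library(ggplot2)
library(dplyr)

# Same subset as in the answer
set.seed(2)
ds <- diamonds %>%
  filter(color %in% c("D", "J"), cut %in% c("Good", "Premium")) %>%
  droplevels() %>%
  sample_n(20) %>%
  as.data.frame()

# One factor level per cut/color combination that occurs in the sample
ds$group <- interaction(ds$cut, ds$color, sep = "-", drop = TRUE)

# Columns to use as axes, referenced by name (illustrative helper name)
num_cols <- c("carat", "depth", "table", "price", "x", "y", "z")

if (requireNamespace("GGally", quietly = TRUE)) {
  p <- GGally::ggparcoord(ds, columns = num_cols, groupColumn = "group",
                          showPoints = TRUE) +
    theme(panel.grid.major.x = element_line(colour = "grey70"))
  print(p)
}
```

Note that this still colours by the combined group only. Varying the marker shape by a second variable needs the hand-built ggplot2 approach.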