Using ggplot2 and facet_grid for continuous and categorical variables together (R)

Question

I am trying to make a series of graphs like this:

I have some mixed categorical and continuous data. I can make this series of graphs when there are only categorical variables or only continuous variables, but not when both types of variables are present.

I have created some data below. Is there a way to debug this code so that it produces a series of graphs?

library(ggplot2) 
library(gridExtra)
library(tidyr)

# create some data

var_1 <- rnorm(100,1,4)
var_2 <- sample( LETTERS[1:2], 100, replace=TRUE, prob=c(0.3, 0.7) )
var_3 <- sample( LETTERS[1:5], 100, replace=TRUE, prob=c(0.2, 0.2,0.2,0.2, 0.1) )
cluster <- sample( LETTERS[1:4], 100, replace=TRUE, prob=c(2.5, 2.5, 2.5, 2.5) )

# put in a data frame

f <- data.frame(var_1, var_2, var_3, cluster)

# convert to factors

f$var_2 = as.factor(f$var_2)
f$var_3 = as.factor(f$var_3)
f$cluster = as.factor(f$cluster)

# create graphs

f %>% pivot_longer(cols = contains("var"), names_to = "variable") %>% 
    ggplot(aes(x = value, fill = value)) + 
    geom_bar() + geom_density() +
    facet_grid(rows = vars(cluster), 
               cols = vars(variable), 
               scales = "free") + 
    labs(y = "freq", fill = "Var")

When I only have categorical variables, the following code works:

var_2 <- sample( LETTERS[1:2], 100, replace=TRUE, prob=c(0.3, 0.7) )

var_3 <- sample( LETTERS[1:5], 100, replace=TRUE, prob=c(0.2, 0.2,0.2,0.2, 0.1) )

cluster <- sample( LETTERS[1:4], 100, replace=TRUE, prob=c(2.5, 2.5, 2.5, 2.5) )

f <- data.frame(var_2, var_3, cluster)
f$var_2 = as.factor(f$var_2)
f$var_3 = as.factor(f$var_3)
f$cluster = as.factor(f$cluster)

f %>% pivot_longer(cols = contains("var"), names_to = "variable") %>% 
    ggplot(aes(x = value, fill = value)) + 
    geom_bar() + geom_density() +
    facet_grid(rows = vars(cluster), 
               cols = vars(variable), 
               scales = "free") + 
    labs(y = "freq", fill = "Var")

Recommended answer

This is possible to do entirely within ggplot, but it's pretty hacky. Facets are really a way of showing extra dimensions of the same data set. They are not intended as a way of arbitrarily stitching different plots together, so an entirely ggplot-based solution requires manipulating your data and the axis labels to produce the appearance of stitched plots.
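To see why the original attempt cannot work as written, here is a minimal sketch on a small made-up data frame (`toy` and the seed are assumptions for illustration, not part of the question). Pivoting a numeric column and a factor column into a single `value` column fails, because the two types cannot be combined:

```r
library(tidyr)

set.seed(1)  # assumption: seed added only so the toy data is reproducible

# Hypothetical miniature version of the question's data frame
toy <- data.frame(
  var_1   = rnorm(6),
  var_2   = factor(sample(LETTERS[1:2], 6, replace = TRUE)),
  cluster = factor(sample(LETTERS[1:4], 6, replace = TRUE))
)

# Capture the error instead of stopping the script
res <- tryCatch(
  pivot_longer(toy, cols = contains("var"), names_to = "variable"),
  error = function(e) e
)

inherits(res, "error")   # TRUE: double and factor cannot share one column
conditionMessage(res)    # the message names the two incompatible columns
```

This is the reason the steps that follow turn the factors into numbers before pivoting.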

First, we get the unique levels of the barplot variables as character strings:

levs    <- sort(unique(c(as.character(f$var_2), as.character(f$var_3))))

Now, we convert the factors to numbers:

f$var_2 <- as.numeric(factor(f$var_2, levs)) + ceiling(max(f$var_1)) + 10
f$var_3 <- as.numeric(factor(f$var_3, levs)) + ceiling(max(f$var_1)) + 10
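A small worked example may make the encoding easier to follow. The vectors below are made-up stand-ins for the columns of `f`, and `offset`, `codes_2` and `codes_3` are names introduced here for illustration. The point of the offset is that every encoded category lands well above the largest continuous value, so the two kinds of data never overlap on the shared x axis:

```r
# Toy stand-ins for f$var_1, f$var_2, f$var_3 (values invented for illustration)
var_1 <- c(-3.2, 0.5, 7.4)
var_2 <- factor(c("A", "B", "B"))
var_3 <- factor(c("E", "C", "A"))

# Shared, sorted set of category labels across both factors
levs <- sort(unique(c(as.character(var_2), as.character(var_3))))
levs  # "A" "B" "C" "E"

# Shift the integer codes past the continuous range: ceiling(7.4) + 10 = 18
offset  <- ceiling(max(var_1)) + 10
codes_2 <- as.numeric(factor(var_2, levs)) + offset
codes_3 <- as.numeric(factor(var_3, levs)) + offset

codes_2  # 19 20 20
codes_3  # 22 21 19
```

Because both factors are re-levelled against the same `levs`, the same letter always maps to the same number in either column.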

We will now construct the breaks and labels that we will use for our x axis:

breaks  <- c(pretty(range(f$var_1)), sort(unique(c(f$var_2, f$var_3))))
labs    <- c(pretty(range(f$var_1)), levs)
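The two vectors have to line up position by position: each break is a location on the numeric axis, and the label at the same index is what gets printed there. A sketch with made-up values (`num_breaks` and `codes` are names assumed here, with `codes` standing in for the encoded category positions):

```r
# Toy values: a continuous variable and four already-encoded categories
var_1 <- c(-3.2, 0.5, 7.4)
levs  <- c("A", "B", "C", "E")
codes <- c(19, 20, 21, 22)   # assumed encoded positions, one per entry of levs

num_breaks <- pretty(range(var_1))   # round tick positions for the numeric part

breaks <- c(num_breaks, codes)       # numeric positions on the x axis
labs   <- c(num_breaks, levs)        # coerced to character: numbers, then letters

length(breaks) == length(labs)       # TRUE: one label per break
```

With free scales, ggplot2 generally draws only the breaks that fall inside each panel's own range, which is what lets the density panels show numbers while the bar panels show letters. Note also that the variable `labs` shares its name with the ggplot2 function `labs()`; R still resolves `labs(...)` calls to the function, but a different name would avoid confusion.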

Now we can safely pivot our data frame:

f <- pivot_longer(f, cols = c("var_1", "var_2", "var_3")) 
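As a quick check of what the pivot produces, here is a sketch on a two-row data frame (`toy` and `long` are illustrative names, and the values are made up). Since all three columns are numeric at this point, they combine into one `value` column, with the original column names stored in `name`, the default for `names_to`:

```r
library(tidyr)

# Toy frame in which the factors have already been converted to numbers
toy <- data.frame(
  var_1   = c(-3.2, 0.5),
  var_2   = c(19, 20),
  var_3   = c(22, 21),
  cluster = factor(c("A", "B"))
)

long <- pivot_longer(toy, cols = c("var_1", "var_2", "var_3"))

names(long)  # "cluster" "name" "value"
nrow(long)   # 6: two original rows times three pivoted columns
```

The `name` column is what the plotting code uses both to subset the data per layer and to define the facets.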

For our plot, we will use appropriately subsetted groups from the data frame for the density plot and the bar plots. We then facet with free scales and label the x axis with our pre-defined breaks and labels:

ggplot(f, aes(x = value)) +
  geom_density(data = subset(f, name == "var_1")) +
  geom_bar(data = subset(f, name != "var_1"), aes(fill = name)) +
  facet_wrap(cluster~name, ncol = 3, scales = "free") +
  scale_x_continuous(breaks = breaks, labels = labs) +
  scale_fill_manual(values = c("deepskyblue4", "gold"), guide = guide_none())
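For comparison, if stitching separate plots is acceptable, a different route is to build one plot per variable and arrange them side by side. This is not the approach described above but a sketch of an alternative using gridExtra, which the question already loads. The seed, the plot names `p1` to `p3` and the layout are assumptions made for illustration:

```r
library(ggplot2)
library(gridExtra)

set.seed(42)  # assumption: seed added only for reproducibility

# Data in the same shape as the question's original (unconverted) frame
f <- data.frame(
  var_1   = rnorm(100, 1, 4),
  var_2   = factor(sample(LETTERS[1:2], 100, replace = TRUE)),
  var_3   = factor(sample(LETTERS[1:5], 100, replace = TRUE)),
  cluster = factor(sample(LETTERS[1:4], 100, replace = TRUE))
)

# One plot per variable, each faceted by cluster in rows
p1 <- ggplot(f, aes(x = var_1)) +
  geom_density() +
  facet_grid(rows = vars(cluster))

p2 <- ggplot(f, aes(x = var_2, fill = var_2)) +
  geom_bar() +
  facet_grid(rows = vars(cluster)) +
  theme(legend.position = "none")

p3 <- ggplot(f, aes(x = var_3, fill = var_3)) +
  geom_bar() +
  facet_grid(rows = vars(cluster)) +
  theme(legend.position = "none")

# Combine into a single three-column layout without drawing it yet
combined <- arrangeGrob(p1, p2, p3, ncol = 3)

# To render: grid::grid.draw(combined), or call grid.arrange(p1, p2, p3, ncol = 3)
```

The trade-off is that each column is an independent plot, so the panels are not guaranteed to align as exactly as true facets, but the data need no numeric re-encoding.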
