Producing a boxplot in ggplot2 using summary statistics


Problem description

Below is code for producing a boxplot with ggplot2, which I'm trying to modify to suit my problem:

library(ggplot2)
set.seed(1)
# create fictitious data
a <- rnorm(10)
b <- rnorm(12)
c <- rnorm(7)
d <- rnorm(15)

# data groups
group <- factor(rep(1:4, c(10, 12, 7, 15)))

# dataframe
mydata <- data.frame(c(a,b,c,d), group)
names(mydata) <- c("value", "group")

# function for computing min, mean - SD, mean, mean + SD and max
min.mean.sd.max <- function(x) {
  r <- c(min(x), mean(x) - sd(x), mean(x), mean(x) + sd(x), max(x))
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}

# ggplot code
p1 <- ggplot(mydata, aes(x = factor(group), y = value)) +
  stat_summary(fun.data = min.mean.sd.max, geom = "boxplot") +
  ggtitle("Boxplot con media, 95%CI, valore min. e max.") +
  xlab("Gruppi") +
  ylab("Valori")

In my case I do not have the actual data points but rather only their mean and standard deviation (the data are normally distributed). So for this example it will be:

mydata.mine <- data.frame(
  mean = c(mean(a), mean(b), mean(c), mean(d)),
  sd = c(sd(a), sd(b), sd(c), sd(d)),
  group = c(1, 2, 3, 4)
)

However, I would still like to produce a boxplot. I thought of defining:

ymin = mean - 3*sd
lower = mean - sd
middle = mean
upper = mean + sd
ymax = mean + 3*sd

but I don't know how to define a function that will access mean and sd of mydata.mine from fun.data in stat_summary. Alternatively, I can just use rnorm to draw points from a normal parameterized by the mean and sd I have, but the first option seems to me a bit more elegant and simple.
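
To illustrate the rnorm alternative mentioned above, here is a minimal sketch. The values of target_mean and target_sd and the sample size of 30 are hypothetical, chosen only for illustration. It suggests one reason the simulation route is less direct: the simulated sample's mean and standard deviation only approximate the summary values they were drawn from, so the resulting box would not sit exactly at mean +/- sd.

```r
set.seed(1)

# Hypothetical summary values, used only for illustration
target_mean <- 0.5
target_sd   <- 2

# Simulate points from the corresponding normal distribution
draws <- rnorm(30, mean = target_mean, sd = target_sd)

# The sample statistics are close to, but not equal to, the targets
sample_stats <- c(mean = mean(draws), sd = sd(draws))
print(sample_stats)
```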

Solution

ggplot(mydata.mine, aes(x = as.factor(group))) +
  geom_boxplot(aes(
      lower = mean - sd, 
      upper = mean + sd, 
      middle = mean, 
      ymin = mean - 3*sd, 
      ymax = mean + 3*sd),
    stat = "identity")
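
The answer works because stat = "identity" tells geom_boxplot to draw the five box statistics exactly as supplied instead of computing them from raw data. A possible variant, sketched below, precomputes those statistics as ordinary columns so they can be inspected before plotting. The name box_stats and the axis labels are illustrative assumptions, not part of the original answer, and the summary table is rebuilt from the question's simulated data so the block runs on its own.

```r
library(ggplot2)

# Rebuild the summary table from the question's simulated data
set.seed(1)
a <- rnorm(10)
b <- rnorm(12)
c <- rnorm(7)
d <- rnorm(15)

box_stats <- data.frame(
  group = factor(1:4),
  mean  = c(mean(a), mean(b), mean(c), mean(d)),
  sd    = c(sd(a), sd(b), sd(c), sd(d))
)

# Derive the five box statistics once, as plain columns
box_stats <- transform(
  box_stats,
  ymin   = mean - 3 * sd,
  lower  = mean - sd,
  middle = mean,
  upper  = mean + sd,
  ymax   = mean + 3 * sd
)

# stat = "identity" draws the supplied values without recomputing them
p <- ggplot(box_stats, aes(x = group)) +
  geom_boxplot(
    aes(ymin = ymin, lower = lower, middle = middle,
        upper = upper, ymax = ymax),
    stat = "identity"
  ) +
  xlab("Group") +
  ylab("Value")

print(p)
```

Keeping the statistics in the data frame makes it easy to check them (for example with print(box_stats)) and to swap in a different whisker rule, such as mean +/- 2*sd, by changing one line.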
