stat_sum and stat_identity give weird results


Problem description

I have the following code, including randomly generated demo data:

n <- 10
group <- rep(1:4, n)
mass.means <- c(10, 20, 15, 30)
mass.sigma <- 4
score.means <- c(5, 5, 7, 4)
score.sigma <- 3
mass <- as.vector(model.matrix(~0+factor(group)) %*% mass.means) +
  rnorm(n*4, 0, mass.sigma)
score <- as.vector(model.matrix(~0+factor(group)) %*% score.means) +
  rnorm(n*4, 0, score.sigma)
data <- data.frame(id = 1:(n*4), group, mass, score)
head(data)

Which gives:

  id group      mass    score
1  1     1 12.643603 5.015746
2  2     2 21.458750 5.590619
3  3     3 15.757938 8.777318
4  4     4 32.658551 6.365853
5  5     1  6.636169 5.885747
6  6     2 13.467437 6.390785

And then I want to plot the sum of "score", grouped by "group", in a bar chart:

plot <- ggplot(data = data, aes(x = group, y = score)) + 
  geom_bar(stat="sum") 
plot

This gives me:

Weirdly, using stat_identity seems to give the result I am looking for:

plot <- ggplot(data = data, aes(x = group, y = score)) + 
  geom_bar(stat="identity") 
plot

Is this a bug? Using ggplot2 1.0.0 on R

platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          3                           
minor          1.2                         
year           2014                        
month          10                          
day            31                          
svn rev        66913                       
language       R                           
version.string R version 3.1.2 (2014-10-31)
nickname       Pumpkin Helmet    

Or what am I doing wrong?

Solution

plot <- ggplot(data = data, aes(x = group, y = score)) + 
  stat_summary(fun.y = "sum", geom = "bar", position = "identity")
plot

aggregate(score ~ group, data=data, FUN=sum)
#  group    score
#1     1 51.71279
#2     2 58.94611
#3     3 67.52100
#4     4 39.24484
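For readers on a newer ggplot2: as of ggplot2 3.3.0 the `fun.y` argument of `stat_summary` is deprecated in favour of `fun`. The following is a minimal sketch of the same plot with that argument, plus a cross-check of the bar heights against `aggregate`. The seed and the shortened data generation (indexing the means by `group` instead of using `model.matrix`) are assumptions made only to keep the example self-contained.

```r
library(ggplot2)

set.seed(42)  # assumed seed, only so the sketch is reproducible
n <- 10
group <- rep(1:4, n)
# Assumed shortcut: index the group means directly instead of model.matrix().
score <- c(5, 5, 7, 4)[group] + rnorm(n * 4, 0, 3)
data <- data.frame(group, score)

# Same plot as above, with `fun` replacing the deprecated `fun.y`.
plot <- ggplot(data = data, aes(x = group, y = score)) +
  stat_summary(fun = sum, geom = "bar", position = "identity")

# layer_data() exposes the values ggplot2 computed for the layer.
bars <- layer_data(plot)
bar_heights <- bars$y[order(bars$x)]

# Cross-check the bar heights against the aggregate() totals.
totals <- aggregate(score ~ group, data = data, FUN = sum)
all.equal(bar_heights, totals$score)
```

If the two agree, each bar is the per-group sum rather than a stack of individual observations.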

Edit:

stat_sum does not work here because it does not compute the sum of the y values. It returns the "number of observations at position" and the "percent of points in that panel at that position". It was designed for a different purpose. The docs say: "Useful for overplotting on scatterplots."
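To see what stat_sum actually computes, you can inspect the layer data. This is a small sketch on hypothetical toy data (the `toy` data frame is not from the question), assuming a ggplot2 version that provides `layer_data()` and exposes the computed variables as `n` and `prop`.

```r
library(ggplot2)

# Hypothetical toy data: the point (1, 2) occurs three times, (2, 5) once.
toy <- data.frame(x = c(1, 1, 1, 2), y = c(2, 2, 2, 5))

p <- ggplot(toy, aes(x = x, y = y)) + stat_sum()

# One row per distinct (x, y) position, with a count and a proportion,
# not a sum of the y values.
computed <- layer_data(p)
computed[, c("x", "y", "n", "prop")]
```

With continuous scores like those in the question, nearly every (group, score) position is unique, so the count is typically 1 everywhere, which is why the resulting plot looks nothing like a bar chart of sums.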

stat_identity (kind of) works because geom_bar stacks the bars by default. You get many bars on top of each other, one per observation, whereas my solution gives you just one bar per group. Outlining the bars in red makes this visible:

plot <- ggplot(data = data, aes(x = group, y = score)) + 
  geom_bar(stat="identity", color = "red") 
plot

Also consider the warning:

Warning message:
Stacking not well defined when ymin != 0
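If you prefer to keep `stat = "identity"`, one way to avoid the stacking altogether is to sum the scores first and plot the aggregated data frame. This is a sketch, not the only way to do it. The seed and the shortened data generation are assumptions for reproducibility, and `sums` is a name introduced here.

```r
library(ggplot2)

set.seed(1)  # assumed seed, only so the sketch is reproducible
n <- 10
group <- rep(1:4, n)
# Assumed shortcut: index the group means directly instead of model.matrix().
score <- c(5, 5, 7, 4)[group] + rnorm(n * 4, 0, 3)
data <- data.frame(group, score)

# One row per group, holding the summed score.
sums <- aggregate(score ~ group, data = data, FUN = sum)

# With pre-aggregated data, stat = "identity" draws exactly one bar per group.
plot <- ggplot(data = sums, aes(x = factor(group), y = score)) +
  geom_bar(stat = "identity")
plot
```

Because each group contributes a single bar, there is nothing to stack.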
