ggplot geom_bar with stat = "sum"

Problem description

I want to plot a bar chart that sums a variable along two dimensions: one spread along x, and the other spread vertically (stacked).

I would expect the following two instructions to do the same thing, but they don't: only the second one gives the desired output (where I aggregate the data myself).

I'd like to understand what's going on in the first case, and whether there's a way to use ggplot2's built-in aggregation features to get the right output.

library(ggplot2)
library(dplyr)
p1 <- ggplot(diamonds,aes(cut,price,fill=color)) + 
  geom_bar(stat="sum",na.rm=TRUE)

yielding this plot:

p2 <- ggplot(diamonds %>%
                group_by(cut,color) %>%
                summarize_at("price",sum,na.rm=TRUE),
              aes(cut,price,fill=color)) +
  geom_bar(stat="identity",na.rm=TRUE)

yielding this picture:

Here's where the top of our bars should be, p1 doesn't give these values:

diamonds %>% group_by(cut) %>% summarize_at("price",sum,na.rm=TRUE)
# # A tibble: 5 x 2
#   cut          price
#   <ord>        <int>
# 1 Fair       7017600
# 2 Good      19275009
# 3 Very Good 48107623
# 4 Premium   63221498
# 5 Ideal     74513487

Solution

You might be misunderstanding the stat option for geom_bar. stat = "sum" selects stat_sum(), which counts the observations at each (x, y) location (it is the stat behind geom_count()); it does not add up the y values. In this case, since you want the prices within each cut to add up to the height of the bar, and each bar to be colored according to how much of that total falls in each color, you can simplify the call to geom_col. geom_col uses the values as bar heights and stacks them, and therefore "sums" all the values within each category. For example, the following will give the desired output:

p1 <- ggplot(diamonds,aes(cut,price,fill=color)) + 
        geom_col(na.rm=TRUE)
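
One way to check that stacking the un-aggregated rows really reaches the right height is to compare the top of each bar with the per-cut totals. The sketch below assumes ggplot2's layer_data() helper, which returns the data frame a layer actually draws; the names ld, bar_tops and totals are illustrative only.

```r
library(ggplot2)

# Same plot as above: one rectangle per row, stacked within each cut.
p <- ggplot(diamonds, aes(cut, price, fill = color)) +
  geom_col(na.rm = TRUE)

# layer_data() returns the data ggplot2 computed for the first layer.
ld <- layer_data(p)

# Highest stacked edge at each x position, i.e. the top of each bar.
bar_tops <- tapply(ld$ymax, as.integer(ld$x), max)

# Price totals per cut, computed directly from the data.
totals <- tapply(diamonds$price, diamonds$cut, sum)

# Put the two side by side for comparison.
data.frame(cut     = names(totals),
           bar_top = as.numeric(bar_tops),
           total   = as.numeric(totals))
```

The bar_top and total columns should agree, since stacking every row's price is arithmetically the same as summing them.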

Alternatively, if you want to use geom_bar with a stat call, then you want to use the "identity" stat:

p1 <- ggplot(diamonds,aes(cut,price,fill=color)) + 
        geom_bar(stat = "identity", na.rm=TRUE)
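
If the goal is to let ggplot2 do the aggregation itself, two approaches that may fit are sketched below; neither comes from the answer above. The first assumes ggplot2 3.3.0 or later, where stat_summary() takes a fun argument (older releases called it fun.y). The second relies on stat_count() adding up the weight aesthetic instead of counting rows. The names p_summary and p_weight are illustrative.

```r
library(ggplot2)

# Option 1: stat_summary() adds up price for each cut/color group,
# and geom_bar's default position stacks the resulting seven bars per cut.
p_summary <- ggplot(diamonds, aes(cut, price, fill = color)) +
  geom_bar(stat = "summary", fun = sum)

# Option 2: keep the default stat_count(), but weight each row by its price,
# so the computed "count" is a sum of prices. Note that y is not mapped here.
p_weight <- ggplot(diamonds, aes(cut, fill = color)) +
  geom_bar(aes(weight = price)) +
  ylab("price")
```

Both variants draw one rectangle per cut/color pair rather than one per row, which may render faster on large data sets.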

For more information, consider this thread: https://stackoverflow.com/a/27965637/6722506
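
To see what stat = "sum" did in the question's first plot, one option is to inspect the computed layer data. This is a sketch under the assumption that layer_data() exposes the n and prop columns that stat_sum() computes; ld, heights, bar_tops and totals are illustrative names.

```r
library(ggplot2)

# The plot from the question.
p1 <- ggplot(diamonds, aes(cut, price, fill = color)) +
  geom_bar(stat = "sum", na.rm = TRUE)

ld <- layer_data(p1)

# Each computed row is one distinct cut/price/color combination:
# its height is a single price, and n records how many diamonds share it.
heights <- ld$ymax - ld$ymin
head(data.frame(x = as.integer(ld$x), height = heights, n = ld$n, prop = ld$prop))

# Every diamond is counted exactly once in n ...
sum(ld$n) == nrow(diamonds)

# ... but n is not used for the bar height, so repeated prices are stacked
# only once and the bars can fall short of the true totals.
bar_tops <- tapply(ld$ymax, as.integer(ld$x), max)
totals   <- tapply(diamonds$price, diamonds$cut, sum)
cbind(bar_top = bar_tops, total = totals)
```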
