ggplot geom_bar with stat = "sum"
Problem description

I want to plot a bar chart summing a variable along two dimensions: one will be spread along x, and the other will be spread vertically (stacked).

I would expect the two following instructions to do the same, but they don't, and only the 2nd one gives the desired output (where I aggregate the data myself).

I'd like to understand what's going on in the first case, and if there's a way to use ggplot2's built-in aggregation features to get the right output.

    library(ggplot2)
    library(dplyr)

    p1 <- ggplot(diamonds, aes(cut, price, fill = color)) +
      geom_bar(stat = "sum", na.rm = TRUE)

yielding this plot:

    p2 <- ggplot(diamonds %>%
                   group_by(cut, color) %>%
                   summarize_at("price", sum, na.rm = TRUE),
                 aes(cut, price, fill = color)) +
      geom_bar(stat = "identity", na.rm = TRUE)

yielding this picture:
Here's where the top of our bars should be, p1 doesn't give these values:

    diamonds %>% group_by(cut) %>% summarize_at("price", sum, na.rm = TRUE)
    # # A tibble: 5 x 2
    #   cut          price
    #   <ord>        <int>
    # 1 Fair       7017600
    # 2 Good      19275009
    # 3 Very Good 48107623
    # 4 Premium   63221498
    # 5 Ideal     74513487

Solution

You might be misunderstanding the stat option for geom_bar. In this case, since you want the values for each factor to be summed up within each bar, and the bars to be colored based on how much of that total falls in each color, you can simplify the call to geom_col, which uses the values as heights for the bars and stacks them, and therefore "sums" all the values within each category. For example, the following will give the desired output:

    p1 <- ggplot(diamonds, aes(cut, price, fill = color)) +
      geom_col(na.rm = TRUE)

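If you want to convince yourself that the stacked geom_col bars really end at the per-cut totals from the question, a quick check along these lines might help. This is only a sketch: p_col, built_col, col_tops, cut_totals and col_tops_match are names I made up for illustration, and it relies on layer_data() returning the stacked ymax values for the layer.

```r
library(ggplot2)

# Same plot as above, stored under a hypothetical name.
p_col <- ggplot(diamonds, aes(cut, price, fill = color)) +
  geom_col(na.rm = TRUE)

# layer_data() returns the data the layer draws, after stacking.
built_col <- layer_data(p_col)

# Top of each stacked bar: the largest ymax at each x position.
col_tops <- tapply(built_col$ymax, as.integer(built_col$x), max)

# Per-cut totals computed directly from the data.
cut_totals <- tapply(diamonds$price, diamonds$cut, sum)

# Compare the two, ignoring names.
col_tops_match <- isTRUE(all.equal(as.numeric(col_tops), as.numeric(cut_totals)))
col_tops_match
```

The as.integer() call is there only to turn the discrete x positions into plain integers before grouping.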
Alternatively, if you want to use geom_bar with a stat call, then you want to use the "identity" stat:

    p1 <- ggplot(diamonds, aes(cut, price, fill = color)) +
      geom_bar(stat = "identity", na.rm = TRUE)

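On the question of built-in aggregation: as far as I can tell, stat = "sum" refers to stat_sum(), the stat behind geom_count(), which counts observations at each location instead of adding up the y values. If you want ggplot2 to do the summing itself, one option is the "summary" stat with sum as the summary function. The sketch below assumes ggplot2 3.3.0 or later, where the argument is called fun (older versions used fun.y); p_summary, built_summary, summary_tops, expected_totals and summary_tops_match are hypothetical names used only for this example.

```r
library(ggplot2)

# Let stat_summary() add up price within each cut/color group;
# geom_bar's default stacking then piles the colors on top of each other.
p_summary <- ggplot(diamonds, aes(cut, price, fill = color)) +
  geom_bar(stat = "summary", fun = sum, na.rm = TRUE)

# One row per cut/color group, with stacked ymin/ymax.
built_summary <- layer_data(p_summary)

# Top of each stacked bar versus the per-cut totals from the raw data.
summary_tops <- tapply(built_summary$ymax, as.integer(built_summary$x), max)
expected_totals <- tapply(diamonds$price, diamonds$cut, sum)

summary_tops_match <- isTRUE(all.equal(as.numeric(summary_tops),
                                       as.numeric(expected_totals)))
summary_tops_match
```

Compared with geom_col on the raw data, this draws one rectangle per cut/color group rather than one per row, which is closer to what the manually aggregated p2 does.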
For more information, consider this thread: https://stackoverflow.com/a/27965637/6722506