ggplot2 stat="identity" and stacking colors in bar plot gives "striped" bar chart

Problem description

Following the answer to my previous question, another question has come up:

How can I, without reshaping the data, plot a stacked bar plot whose colours depend on another category, while using stat="identity" to sum up the values for each stacked area?

stat="identity" works nicely to sum up the values, but only for non-stacked columns. In a stacked column, the stacking is somehow "multiplied" or "striped"; see the picture below.

Some sample data:

element <- rep("apples", 15)
qty <- c(2, 1, 4, 3, 6, 2, 1, 4, 3, 6, 2, 1, 4, 3, 6)
category1 <- c("Red", "Green", "Red", "Green", "Yellow")
category2 <- c("small","big","big","small","small")
d <- data.frame(element=element, qty=qty, category1=category1, category2=category2)

Which gives this table:

id  element  qty category1 category2
1   apples   2       Red     small
2   apples   1     Green       big
3   apples   4       Red       big
4   apples   3     Green     small
5   apples   6    Yellow     small
6   apples   2       Red     small
7   apples   1     Green       big
8   apples   4       Red       big
9   apples   3     Green     small
10  apples   6    Yellow     small
11  apples   2       Red     small
12  apples   1     Green       big
13  apples   4       Red       big
14  apples   3     Green     small
15  apples   6    Yellow     small

Then:
library(ggplot2)
ggplot(d, aes(x=category1, y=qty, fill=category2)) + geom_bar(stat="identity")

But the graph is a bit messy: the colors aren't grouped together!

Why does this happen?

Is there still a way to group the colors correctly without reshaping my data?

Solution

I used this solution for a while, but on my large databases (60,000 entries) the ordered stacked bars that ggplot2 drew showed, depending on the zoom level, some white space between the bars. I'm not sure where this issue comes from, but a wild guess is that I'm stacking too many bars :p .

Aggregating the data with plyr solved the problem:

element <- rep("apples", 15)
qty <- c(2, 1, 4, 3, 6, 2, 1, 4, 3, 6, 2, 1, 4, 3, 6)
category1 <- c("Red", "Green", "Red", "Green", "Yellow")
category2 <- c("small","big","big","small","small")
d <- data.frame(element=element, qty=qty, category1=category1, category2=category2)

With plyr:

library(plyr)
d <- ddply(d, .(category1, category2), summarize, qty=sum(qty, na.rm = TRUE))
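For readers who would rather not depend on plyr, the same aggregation can probably be done with base R's aggregate(). This is only a sketch of an alternative, not what the answer above used; the name d_sum is made up for illustration.

```r
# Same sample data as in the question (the two category vectors are recycled to 15 rows)
d <- data.frame(
  element   = rep("apples", 15),
  qty       = c(2, 1, 4, 3, 6, 2, 1, 4, 3, 6, 2, 1, 4, 3, 6),
  category1 = c("Red", "Green", "Red", "Green", "Yellow"),
  category2 = c("small", "big", "big", "small", "small")
)

# Sum qty within each category1/category2 combination
# (d_sum is a hypothetical name; rows with NA are dropped by the formula interface)
d_sum <- aggregate(qty ~ category1 + category2, data = d, FUN = sum)

# Sort so the rows appear in the same order as the ddply result
d_sum <- d_sum[order(d_sum$category1, d_sum$category2), ]
print(d_sum)
```

The result has one row per combination, which is what lets each bar segment be drawn as a single rectangle.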

To briefly explain the parts of this call:

ddply(1, .(2, 3), summarize, 4=function(6, na.rm = TRUE))

1: the data frame name
2, 3: the columns to keep, i.e. the grouping factors by which the calculations are made
summarize: creates a new data frame (unlike transform)
4: the name of the calculated column
function: the function to apply, here sum()
6: the column to which the function is applied

The name, function and column part (4 to 6) can be repeated for more calculated fields...
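As an illustration of repeating the name/function/column part, here is a small sketch that computes several fields in one ddply call. It assumes plyr is installed; the output names total_qty, mean_qty and n_rows are invented for this example. They deliberately differ from the input column name, since summarize appears to evaluate its arguments in order and a reused name would shadow the original column for later expressions.

```r
library(plyr)

# Same sample data as in the question
d <- data.frame(
  element   = rep("apples", 15),
  qty       = c(2, 1, 4, 3, 6, 2, 1, 4, 3, 6, 2, 1, 4, 3, 6),
  category1 = c("Red", "Green", "Red", "Green", "Yellow"),
  category2 = c("small", "big", "big", "small", "small")
)

# One output row per category1/category2 group, with three calculated columns
# (all three column names are hypothetical)
d_multi <- ddply(d, .(category1, category2), summarize,
                 total_qty = sum(qty, na.rm = TRUE),   # sum of qty in the group
                 mean_qty  = mean(qty, na.rm = TRUE),  # average qty in the group
                 n_rows    = length(qty))              # number of rows in the group
print(d_multi)
```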

With ggplot2:

ggplot(d, aes(x=category1, y=qty, fill=category2)) + geom_bar(stat="identity")
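Since the question asked for a way to avoid reshaping the data beforehand, one possible alternative is to let ggplot2 do the summing itself with stat_summary(). This is a sketch under assumptions, not the approach used in this answer: it assumes ggplot2 3.3.0 or later, where stat_summary() takes a fun argument (older versions used fun.y), and the names d_raw and p are made up for illustration.

```r
library(ggplot2)

# Unaggregated sample data, as in the question (d_raw is a hypothetical name)
d_raw <- data.frame(
  element   = rep("apples", 15),
  qty       = c(2, 1, 4, 3, 6, 2, 1, 4, 3, 6, 2, 1, 4, 3, 6),
  category1 = c("Red", "Green", "Red", "Green", "Yellow"),
  category2 = c("small", "big", "big", "small", "small")
)

# Sum qty per x value and fill group inside the plot, then stack the sums,
# so each fill colour should appear as a single segment per bar
p <- ggplot(d_raw, aes(x = category1, y = qty, fill = category2)) +
  stat_summary(fun = sum, geom = "bar", position = "stack")
print(p)
```

With this, the aggregation lives in the plot definition, which may be convenient for quick exploration; pre-aggregating with ddply keeps the summarized table available for other uses.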

So now, as suggested by Roman Luštrik, the data is aggregated to match the graph to be shown.

After applying ddply, indeed, the data is cleaner:

  category1 category2 qty
1     Green       big   3
2     Green     small   9
3       Red       big  12
4       Red     small   6
5    Yellow     small  18

I finally understood how to manage my dataset thanks to this really great source of information: http://jaredknowles.com/r-bootcamp https://dl.dropbox.com/u/1811289/RBootcamp/slides/Tutorial3_DataSort.html

And this one too: http://streaming.stat.iastate.edu/workshops/r-intro/lectures/6-advancedmanipulation.pdf

... Just because ?ddply is a bit... strange (the examples differ from the explanation of the options), and it seems to say nothing about the shorthand notation. But I may have missed something...
