ggplot graphing of proportions of observations within categories


Question



I am looking for advice on better ways to plot the proportion of observations in various categories.

I have a dataframe that looks something like this:

cat1 <- c("high", "low", "high", "high", "high", "low", "low", "low", "high", "low", "low")
cat2 <- c("1-young", "3-old", "2-middle-aged", "3-old", "2-middle-aged", "2-middle-aged", "1-young", "1-young", "3-old", "3-old", "1-young")
df <- as.data.frame(cbind(cat1, cat2))

In the example here, I want to plot the proportion of each age group that have the value "high", and the proportion of each age group that have the value "low". More generally, I want to plot, for each value of category 2, the percent of observations that fall into each of the levels of category 1.

The following code produces the right result, but only by manually counting and dividing before plotting. Is there a good way to do this on the fly within ggplot?

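library(plyr)
library(ggplot2)

# Counts per (cat1, cat2) combination and per cat2 group
count1 <- count(df, vars = c("cat1", "cat2"))
count2 <- count(df, "cat2")

# Look up each row's group total by cat2 instead of relying on vector recycling
count1$totals <- count2$freq[match(count1$cat2, count2$cat2)]
count1$pct <- count1$freq / count1$totals

# The bar heights are precomputed, so use stat = "identity"
ggplot(data = count1, aes(x = cat2, y = pct)) +
  facet_wrap(~cat1) +
  geom_bar(stat = "identity")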

This previous stackoverflow question offers something similar, with the following code:

ggplot(mydataf, aes(x = foo)) + 
geom_bar(aes(y = (..count..)/sum(..count..)))

But I do not want "sum(..count..)" - which gives the sum of the count of all the bins - in the denominator; rather, I want the sum of the count of each of the "cat2" categories. I have also studied the stat_bin documentation.

I would be grateful for any tips and suggestions on how to make this work.

Solution

I will understand if this isn't really what you're looking for, but I found your description of what you wanted very confusing until I realized that you were simply trying to visualize your data in a way that seemed very unnatural to me.

If someone asked me to produce a graph with the proportions within each category, I'd probably turn to a segmented bar chart:

ggplot(df,aes(x = cat2,fill = cat1)) + 
    geom_bar(position = "fill")

Note the y axis records proportions, not counts, as you wanted.
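
To sanity-check the proportions that position = "fill" draws, the same within-group shares can be computed in base R. This is only a minimal sketch, not part of the original answer: it rebuilds the question's example data and uses table() and prop.table(), and the names counts, within_group and overall are illustrative.

```r
# Example data from the question
cat1 <- c("high", "low", "high", "high", "high", "low",
          "low", "low", "high", "low", "low")
cat2 <- c("1-young", "3-old", "2-middle-aged", "3-old", "2-middle-aged",
          "2-middle-aged", "1-young", "1-young", "3-old", "3-old", "1-young")
df <- data.frame(cat1, cat2)

# Cross-tabulate: rows are cat1 levels, columns are cat2 groups
counts <- table(df$cat1, df$cat2)

# margin = 2 divides each column by its own total, so every cat2 group
# sums to 1 (the denominator the question asks for)
within_group <- prop.table(counts, margin = 2)

# Without a margin, every cell is divided by the grand total instead,
# which corresponds to the sum(..count..) denominator in the question
overall <- prop.table(counts)

print(round(within_group, 2))
```

With this data the share of "high" is 1/4 for 1-young, 2/3 for 2-middle-aged and 1/2 for 3-old, which should match the segment heights in the stacked chart.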
