How to control ordering of histogram using identity on ggplot2


Question

I am new to R and ggplot2. I am using the College dataset from the ISLR package. When I make a histogram and fill it with aes(fill = Private), I get the following plot. This plot is highly misleading, because if I create a table of Private I get

 No Yes
212 565

but the histogram created by ggplot2 can be interpreted as showing more "No" than "Yes". The reference course has the following figure, which correctly depicts the number of "Yes" and "No", but according to the creator of the histogram, it was generated with an older version of ggplot2. Note that I see the difference in the statistics, but this does not take away from the objective of this post.

The question is: how can I generate this histogram with the new version of ggplot2 so that it depicts "Yes" and "No" properly, as seen in the second histogram?

I have looked at several other SO posts, such as for stacks, more stacks, for barplots, here, and from ggplot2, but I have not seen an answer for histograms. I have tried using arrange from the dplyr package as well as the base R order function, but to no avail.

Here is my R code:

library(ISLR)
library(ggplot2)
library(dplyr)
df <- College

ggplot(df, aes(F.Undergrad)) +
  geom_histogram(aes(fill = Private), bins = 50, color = 'black', alpha = 0.5) +
  theme_bw()

Recommended answer

If I understood you correctly, all you need to do is reorder the factor levels of df$Private:

df$Private <- relevel(df$Private, "Yes")

ggplot(df, aes(F.Undergrad)) +
  geom_histogram(aes(fill = Private),
                 bins = 50,
                 color = 'black',
                 alpha = 0.5) +
  theme_bw()
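
As a quick check of what relevel does, here is a minimal sketch on a small made-up factor. The vector `private` is a hypothetical stand-in, not the College data; the point is only that the level order changes while the counts do not.

```r
# Hypothetical stand-in for College$Private: 2 "No" and 3 "Yes" values.
private <- factor(c("No", "Yes", "Yes", "No", "Yes"))

# Factor levels are sorted alphabetically by default.
levels(private)

# relevel() moves the named level to the front; the data are unchanged.
private2 <- relevel(private, ref = "Yes")

# "Yes" is now the first level, so ggplot2 maps it first.
levels(private2)

# The counts per level are the same as before, only their order differs.
table(private2)
```

Since ggplot2 takes the order of the fill groups from the factor levels, changing the level order is what changes the order of the groups in the legend and in the stack.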

The information is essentially the same, because the bars are stacked (the default position for geom_histogram). If you don't want that, you should follow the advice from @Tino and use position = "dodge":

ggplot(df, aes(F.Undergrad)) +
  geom_histogram(aes(fill = Private),
                 bins = 50,
                 color = 'black',
                 alpha = 0.5,
                 position = "dodge") +
  theme_bw()

Or with position = "identity", which draws the bars for each group from the baseline so that they overlap:

ggplot(df, aes(F.Undergrad)) +
  geom_histogram(aes(fill = Private),
                 bins = 50,
                 color = 'black',
                 alpha = 0.5,
                 position = "identity") +
  theme_bw()
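
To see why the stacked version can be misread, here is a rough base R sketch of the bar heights under the two positions. The values, the break points, and the names `enrollment`, `private`, `breaks`, and `counts` are all made up for illustration, and cut() is only an approximation of how geom_histogram bins the data.

```r
# Hypothetical enrollment values for two groups (not the College data).
enrollment <- c(500, 800, 1200, 1500, 700, 4000, 9000)
private    <- factor(c("Yes", "Yes", "Yes", "Yes", "No", "No", "No"))

# Bin both groups with the same breaks, as a histogram would.
breaks <- seq(0, 10000, by = 2500)
bins   <- cut(enrollment, breaks = breaks)

# Rows are bins, columns are groups.
counts <- table(bins, private)

# With position = "identity", each group's bar starts at zero,
# so its height is that group's own count in the bin.
counts[, "Yes"]
counts[, "No"]

# With position = "stack", one group sits on top of the other,
# so the top of the stacked bar is the sum over both groups.
rowSums(counts)
```

In the stacked plot, only the group at the bottom of the stack can be read directly off the y axis; the other group's count is the height of its segment, not the position of its top edge.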
