How to control ordering of histogram using identity on ggplot2
Question
I am new to R and ggplot2. I am using the College dataset found in the ISLR package. When I make a histogram and fill it with aes(fill = Private), I get the following plot. This plot is highly misleading, because if I create a table of Private I get:
 No Yes
212 565
but the histogram created by ggplot2 can be read as showing more "No" than "Yes". The reference course has the following figure, which correctly depicts the number of "Yes" and "No", but according to the creator of that histogram, it was generated with an older version of ggplot2. Note that I see the difference in the statistics, but this does not take away from the objective of this post.
The question is: with the new version of ggplot2, how do I generate this histogram so that it depicts "Yes" and "No" properly, as seen in the second histogram?
I have looked at several other SO posts (for stacks, more stacks, for barplots, here) and at material from ggplot2, but I have not seen an answer for histograms. I have tried using arrange from the dplyr package as well as the R order function, but to no avail.
Here is my R code:
library(ISLR)
library(ggplot2)
library(dplyr)
df<-College
ggplot(df,aes(F.Undergrad))+geom_histogram(aes(fill=Private),bins = 50,color='black',alpha=0.5)+theme_bw()
Recommended answer
If I understood you correctly, all you need to do is reorder the factor levels of df$Private:
df$Private <- relevel(df$Private, "Yes")
ggplot(df, aes(F.Undergrad)) +
  geom_histogram(aes(fill = Private),
                 bins = 50,
                 color = 'black',
                 alpha = 0.5) +
  theme_bw()
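To see what relevel() actually changes, here is a minimal base R sketch. It uses a hypothetical toy factor named private in place of df$Private, so the values are illustrative and not taken from the College data. It assumes only that Private is a factor whose default level order is alphabetical.

```r
# Hypothetical toy factor standing in for df$Private (not the real data)
private <- factor(c("No", "Yes", "Yes", "No", "Yes"))

# Default level order is alphabetical, so "No" comes first
levels(private)

# relevel() moves the named level to the front; the data are untouched
private_releveled <- relevel(private, ref = "Yes")
levels(private_releveled)

# Equivalent alternative: spell out the full level order explicitly
private_explicit <- factor(private, levels = c("Yes", "No"))
levels(private_explicit)

# The counts are the same either way; only their order changes
table(private)
table(private_releveled)
```

ggplot2 stacks and draws the fill groups following the factor's level order, which is why changing the levels changes which group sits on top without changing any counts.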
The information is essentially the same, because the bars are stacked. If you don't want that, you should follow the advice from @Tino and use position = "dodge":
ggplot(df, aes(F.Undergrad)) +
  geom_histogram(aes(fill = Private),
                 bins = 50,
                 color = 'black',
                 alpha = 0.5,
                 position = "dodge") +
  theme_bw()
With position = "identity":
ggplot(df, aes(F.Undergrad)) +
  geom_histogram(aes(fill = Private),
                 bins = 50,
                 color = 'black',
                 alpha = 0.5,
                 position = "identity") +
  theme_bw()
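The difference between stacking and position = "identity" can also be checked numerically. The sketch below is an illustration on made-up data, not the College dataset: the toy data frame, its size and group columns, and the bin width of 1 are all assumptions chosen for the example. It counts observations per bin and per group with base R, then shows the bar tops a stacked histogram would draw.

```r
# Hypothetical toy data (not the College dataset)
set.seed(1)
toy <- data.frame(
  size  = c(rnorm(200, mean = 10, sd = 2), rnorm(80, mean = 14, sd = 3)),
  group = factor(rep(c("Yes", "No"), times = c(200, 80)))
)

# Bin the values with an assumed bin width of 1
breaks <- seq(floor(min(toy$size)), ceiling(max(toy$size)), by = 1)
bins <- cut(toy$size, breaks = breaks, include.lowest = TRUE)

# Per-bin counts for each group: with position = "identity" every group
# is drawn from zero, so these are the bar heights
counts <- table(bins, toy$group)

# With stacking, the top of each bar is the sum over groups, so one group's
# segment starts where the other one ends
stacked_top <- rowSums(counts)

head(cbind(counts, stacked_top))
```

Reading a stacked bar's top edge as one group's count overstates that group, which is one plausible reason the stacked plot looks misleading next to the table of Private.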