Filter ggplot2 density plot by number of observations
Problem description
Is it possible to filter out subsets of the data that have small numbers of observations within a ggplot2 call?
For example, take the following plot: qplot(price, data=diamonds, geom="density", colour=cut)
The plot is a little busy, and I would like to exclude the cut values with a small number of observations, i.e., given
> xtabs(~cut,diamonds)
cut
     Fair      Good Very Good   Premium     Ideal
     1610      4906     12082     13791     21551
the Fair and Good qualities of the cut factor.
I want a solution that works for an arbitrary data set and, if possible, can select not only by a threshold number of observations but also, for example, by the top 3 groups.
Recommended answer
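library(ggplot2)
library(plyr)  # provides count(), arrange(), desc() and .()
# keep only the three most frequent cut levels, then draw one density panel per level
ggplot(subset(diamonds, cut %in% arrange(count(diamonds, .(cut)), desc(freq))[1:3,]$cut),
       aes(price, colour=cut)) +
  geom_density() + facet_grid(~cut)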
- count tallies the rows for each value of cut and returns the result as a data.frame with a freq column.
- arrange orders a data.frame by the specified column.
- desc reverses the sort order, so the most frequent values come first.
- finally, subset keeps the rows whose cut is among the top 3 values, using %in%.
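The answer above covers the top-3 case; the question also asks about selecting by a threshold number of observations. The following is a minimal sketch of both selections using only base R's table(). The helper names top_n_levels and min_count_levels are hypothetical (they are not part of ggplot2 or plyr), and the plotting step assumes ggplot2 is installed.

```r
# Hypothetical helpers (not part of ggplot2 or plyr): choose levels by frequency.

# The n most frequent values of x, most frequent first.
top_n_levels <- function(x, n) {
  counts <- sort(table(x), decreasing = TRUE)
  names(counts)[seq_len(min(n, length(counts)))]
}

# The values of x that occur at least min_n times.
min_count_levels <- function(x, min_n) {
  counts <- table(x)
  names(counts)[as.vector(counts) >= min_n]
}

# Usage with the diamonds data; skipped if ggplot2 is not available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  keep <- min_count_levels(diamonds$cut, 10000)   # threshold on observations
  # keep <- top_n_levels(diamonds$cut, 3)         # or: the three largest groups
  p <- ggplot(subset(diamonds, cut %in% keep), aes(price, colour = cut)) +
    geom_density()
  print(p)
}
```

With the counts shown in the question, a threshold of 10000 keeps Very Good, Premium and Ideal, which is the same set as the top 3. Because the helpers take a plain vector, the same approach should carry over to any data set and grouping column.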