Create a bin for anything above X value in a ggplot2 histogram
Question
Using ggplot2, I want to create a histogram where anything above some value X is grouped into the final bin. For example, if most of my distribution is between 100 and 200 and I want to bin by 10, I would want anything above 200 to be collected in a single "200+" bin.
# create some fake data
id <- sample(1:100000, 10000, replace = TRUE)
visits <- sample(1:1200, 10000, replace = TRUE)
# combine into a data frame
df <- data.frame(id, visits)
# plot the data
library(ggplot2)
visits_hist <- ggplot(df, aes(x = visits)) + geom_histogram(binwidth = 50)
How can I limit the X axis while still representing the data beyond that limit?
Recommended answer
Perhaps you're looking for the breaks argument of geom_histogram:
# create some fake data
id <- sample(1:100000, 10000, replace = TRUE)
visits <- sample(1:1200, 10000, replace = TRUE)
# combine into a data frame
df <- data.frame(id, visits)
# plot the data: 10-wide bins up to 200, then one bin for everything above
library(ggplot2)
ggplot(df, aes(x = visits)) +
  geom_histogram(breaks = c(seq(0, 200, by = 10), max(df$visits)), position = "identity") +
  # zoom the x axis without dropping the data above 210
  coord_cartesian(xlim = c(0, 210))
This gives 10-wide bins up to 200 and one final bin for everything above, with the caveats that the fake data looks pretty bad here and that the axis still needs to be adjusted to match the breaks.
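To see what those breaks do to the data without drawing anything, here is a minimal sketch in base R. It assumes the same fake `visits` vector and uses `cut()` and `table()` to reproduce the binning by hand; the seed and the `overflow` name are my own additions for illustration.

```r
set.seed(42)  # assumed seed, only so the sketch is reproducible
visits <- sample(1:1200, 10000, replace = TRUE)

# same break points as in the plot: 10-wide bins up to 200, then one catch-all bin
brks <- c(seq(0, 200, by = 10), max(visits))

# cut() assigns each value to a right-closed interval (a, b]; table() counts per interval
bins <- cut(visits, breaks = brks)
counts <- table(bins)

# the last interval, (200, max(visits)], holds every value above 200
overflow <- counts[[length(counts)]]
print(counts)
```

With uniform fake data the catch-all bin dwarfs the others, which is why the plot looks lopsided; real data concentrated between 100 and 200 should look more balanced.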
I also tried to relabel the axis, without success. Maybe someone else can weigh in here:
# create breaks and labels
brks <- c(seq(0, 200, by = 10), max(visits))
lbls <- c(as.character(seq(0, 190, by = 10)), "200+", "")
# TRUE: both vectors have 22 elements
length(brks) == length(lbls)
# hmmm: labels are supplied to the scale without matching breaks
ggplot(df, aes(x = visits)) +
  geom_histogram(breaks = brks, position = "identity") +
  coord_cartesian(xlim = c(0, 220)) +
  scale_x_continuous(labels = lbls)
Plotting fails with this error:
Error in scale_labels.continuous(scale) :
Breaks and labels are different lengths
This looks like a previously reported issue, but that one was fixed 8 months ago.
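One possible way around the error, offered as a sketch rather than a definitive fix: the failing call passes `labels` to `scale_x_continuous` without `breaks`, so the scale presumably compares the 22 labels against its own default tick positions. Supplying `breaks` and `labels` of equal length avoids that. The version below also caps the data with `pmin()` so the overflow bin is as wide as the others. The names `cap`, `width`, `visits_capped`, `tick_at` and `tick_lbl` are hypothetical, and the seed is assumed.

```r
library(ggplot2)

set.seed(42)  # assumed seed, only so the sketch is reproducible
df <- data.frame(visits = sample(1:1200, 10000, replace = TRUE))

cap <- 200    # hypothetical threshold: everything above goes into one bin
width <- 10   # bin width

# move every value above the cap into the middle of the bin (200, 210]
df$visits_capped <- pmin(df$visits, cap + width / 2)

# tick positions and labels of equal length; the last tick reads "200+"
tick_at <- seq(0, cap, by = 20)
tick_lbl <- c(as.character(head(tick_at, -1)), paste0(cap, "+"))

p <- ggplot(df, aes(x = visits_capped)) +
  geom_histogram(breaks = seq(0, cap + width, by = width)) +
  scale_x_continuous(breaks = tick_at, labels = tick_lbl)
print(p)
```

The trade-off is that the plotted values above 200 are no longer the real ones, so the capped column should only be used for drawing.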