ggplot2 density histogram with width=.5, vline and centered bar positions

Problem description

I want a nice density (that sums to 1) histogram for some discrete data. I have tried a couple of ways to do this, but none were entirely satisfactory.

Generate some data:

#data
set.seed(-999)
d.test = data.frame(score = round(rnorm(100,1)))
mean.score = mean(d.test[,1])
d1 = as.data.frame(prop.table(table(d.test)))

The first gives the right placement of bars -- centered on top of the number -- but the wrong placement of vline(). This is because the x-axis is discrete (factor) and so the mean is plotted using the number of levels, not the values. The mean value is .89.

ggplot(data=d1, aes(x=d.test, y=Freq)) +
  geom_bar(stat="identity", width=.5) +
  geom_vline(xintercept=mean.score, color="blue", linetype="dashed")
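To see why the line lands in the wrong place, it may help to look at how a factor stores its values. The following is a minimal base R sketch with made-up scores (the names `score_demo`, `level_pos` and `values` are illustrative, not from the question). A discrete axis places each level at positions 1, 2, 3, ..., so an `xintercept` of about 1 falls on the first level rather than on the level labelled "1".

```r
# Made-up scores for illustration; not the data from the question.
score_demo <- c(-1, 0, 1, 1, 2, 3)
f <- factor(score_demo)

# as.numeric() on a factor returns the level positions (1, 2, 3, ...),
# which is what a discrete axis uses to place the bars.
level_pos <- as.numeric(f)

# Going through as.character() recovers the original numeric values.
values <- as.numeric(as.character(f))

print(levels(f))   # "-1" "0" "1" "2" "3"
print(level_pos)   # 1 2 3 3 4 5
print(values)      # -1 0 1 1 2 3
```

Here the value 1 sits at level position 3, so a line drawn at x = 1 on the discrete axis would appear over the bar for -1.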

The second gives the correct vline() placement (because the x-axis is continuous), but wrong placement of bars and the width parameter does not appear to be modifiable when x-axis is continuous (see here). I also tried the size parameter which also has no effect. Ditto for hjust.

ggplot(d.test, aes(x=score)) +
  geom_histogram(aes(y=..count../sum(..count..)), width=.5) +
  geom_vline(xintercept=mean.score, color="blue", linetype="dashed")

Any ideas? My bad idea is to rescale the mean so that it fits with the factor levels and use the first solution. This won't work well in case some of the factor levels are 'missing', e.g. 1, 2, 4 with no factor for 3 because no datapoint had that value. If the mean is 3.5, rescaling this is odd (x-axis is no longer an interval scale).
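The missing-level concern can be checked directly. This is a small base R sketch with made-up scores that skip the value 3 (`score_gap`, `observed` and `complete` are illustrative names). It shows that a factor built from the observed data drops the empty value, so the discrete axis is no longer evenly spaced, whereas declaring the full range of levels keeps it as a zero-proportion entry.

```r
# Made-up scores with a gap: no observation equals 3.
score_gap <- c(1, 2, 2, 4, 4, 4)

# Levels taken from the data: the value 3 disappears entirely.
observed <- prop.table(table(score_gap))

# Levels declared over the full integer range: 3 is kept with proportion 0.
full_levels <- factor(score_gap, levels = seq(min(score_gap), max(score_gap)))
complete <- prop.table(table(full_levels))

print(names(observed))   # "1" "2" "4"
print(names(complete))   # "1" "2" "3" "4"
```

Even with complete levels, the mean would still have to be shifted onto level positions by hand, so this only illustrates the problem and is not a recommended fix.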

Another idea is this:

ggplot(d.test, aes(x=score)) +
  stat_bin(binwidth=.5, aes(y= ..density../sum(..density..)), hjust=-.5) +
  scale_x_continuous(breaks = -2:5) + #add ticks back
  geom_vline(xintercept=mean.score, color="blue", linetype="dashed")

But this requires adjusting the breaks, and the bars are still in the wrong positions (not centered). Unfortunately, hjust does not appear to work.

How do I get everything I want?

  • density sums to 1
  • bars centered above values
  • vline() at the correct number
  • width=.5

With base graphics, one could perhaps solve this problem by plotting twice on the x-axis. Is there some similar way here?

Solution

It sounds like you just want to make sure that your x-axis values are numeric rather than factors:

ggplot(data=d1, aes(x=as.numeric(as.character(d.test)), y=Freq)) +
  geom_bar(stat="identity", width=.5) +
  geom_vline(xintercept=mean.score, color="blue", linetype="dashed") + 
  scale_x_continuous(breaks=-2:3)

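which gives a bar chart with bars of width .5 centered over each score and the dashed mean line at the correct x position.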
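As a possible variation on the same idea, the proportions can be computed by ggplot2 itself from the raw scores, skipping the intermediate table. The sketch below is not from the answer. It assumes ggplot2 3.3.0 or later (for `after_stat()`) and a single panel and group, so that `sum(count)` covers all the data. The x-axis stays numeric, so `width` and `geom_vline()` behave as in the answer.

```r
library(ggplot2)

# Same data as in the question.
set.seed(-999)
d.test <- data.frame(score = round(rnorm(100, 1)))
mean.score <- mean(d.test$score)

# geom_bar() counts rows per distinct x value; dividing by the total
# turns the counts into proportions that sum to 1.
p <- ggplot(d.test, aes(x = score)) +
  geom_bar(aes(y = after_stat(count / sum(count))), width = 0.5) +
  geom_vline(xintercept = mean.score, color = "blue", linetype = "dashed") +
  scale_x_continuous(breaks = seq(min(d.test$score), max(d.test$score))) +
  ylab("Proportion")

# Inspect the computed bars: heights should sum to 1 and each bar
# should span 0.25 on either side of its score.
bars <- layer_data(p, 1)
print(sum(bars$y))
print(unique(round(bars$xmax - bars$xmin, 10)))
```

Printing `p` draws the plot. The precomputed-table approach in the answer remains the more explicit option when the proportions are needed elsewhere.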
