Plot median values on top of a density distribution

Question

I'm trying to plot the median values of some data on a density distribution using the ggplot2 R library. I would like to print the median values as text on top of the density plot.

You'll see what I mean with an example (using the default "diamonds" data frame):

I'm drawing three items: the density plot itself, a vertical line showing the median price of each cut, and a text label with that value. But, as you can see, the median price labels overlap because they all share the same "y" value (this aesthetic is mandatory in the geom_text() function).

Is there any way to dynamically assign a "y" value to each median price, so as to print them at different heights? For example, at the maximum density value of each "cut".

So far I've got this:

# ggplot2 provides the plotting functions and the diamonds data set
library(ggplot2)
library(plyr)

# input dataframe
dia <- diamonds

# calculate the median of each numerical variable, per cut:
dia_me <- ddply(dia, .(cut), numcolwise(median))

ggplot(dia, aes(x=price, y=..density.., color = cut, fill = cut)) +
  labs(title="diamond price per cut") +
  geom_density(alpha = 0.2) +
  geom_vline(data=dia_me, aes(xintercept=price, colour=cut),
             linetype="dashed", size=0.5) +
  scale_x_log10() +
  # y is fixed at 1 for every label, which is why the labels overlap
  geom_text(data = dia_me, aes(label = price, y=1, x=price))

(I'm assigning a constant value to the y aesthetic in the geom_text function because it's mandatory.)

Recommended answer

This might be a start (though the colors make it hard to read). My idea was to add a 'y' position to the data frame used to draw the median lines. The values are somewhat arbitrary: I wanted the y-positions to fall between 0.2 and 1 so that they fit nicely on the plot, and I generated them with seq(). I then tried to order them by median price, which didn't help much; that choice is arbitrary too.

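# scatter the label heights over the plot: evenly spaced values between
# 0.2 and 1, indexed by the ordering permutation of the median prices
# (this spreads the labels but does not strictly sort heights by price)
dia_me$y_pos <- seq(0.2, 1, length.out = nrow(dia_me))[order(dia_me$price, decreasing = TRUE)]

ggplot(dia, aes(x=price, y=..density.., color = cut, fill = cut)) +
  labs(title="diamond price per cut") +
  geom_density(alpha = 0.2) +
  geom_vline(data=dia_me, aes(xintercept=price, colour=cut),
             linetype="dashed", size=0.5) +
  scale_x_log10() +
  # each label now uses its own y position instead of a constant
  geom_text(data = dia_me, aes(label = price, y=y_pos, x=price))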
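The question also asks about placing each label at the maximum density value of its cut, which the seq() approach above does not do. The following is a minimal sketch of one way this could be done, not a definitive solution. It assumes that, because of scale_x_log10(), geom_density() estimates the density of log10(price), so the peak heights are computed with stats::density() on the log10 values. The names price_by_cut, dia_lab and p are made up for this illustration, and the peaks may differ slightly from the ones ggplot2 draws because the evaluation grids are not identical.

```r
library(ggplot2)

# Split prices by cut (the list follows the order of the factor levels)
price_by_cut <- split(diamonds$price, diamonds$cut)

# One row per cut: median price (x position of the label) and the peak
# of the estimated density of log10(price) (y position of the label).
# Assumption: this peak approximates the top of the curve ggplot2 draws.
dia_lab <- data.frame(
  cut   = factor(names(price_by_cut), levels = levels(diamonds$cut), ordered = TRUE),
  price = sapply(price_by_cut, median),
  y_pos = sapply(price_by_cut, function(p) max(density(log10(p))$y))
)

p <- ggplot(diamonds, aes(x = price, colour = cut, fill = cut)) +
  labs(title = "diamond price per cut") +
  geom_density(alpha = 0.2) +
  geom_vline(data = dia_lab, aes(xintercept = price, colour = cut),
             linetype = "dashed") +
  scale_x_log10() +
  # vjust nudges each label just above its own peak height
  geom_text(data = dia_lab, aes(x = price, y = y_pos, label = price),
            vjust = -0.5, show.legend = FALSE)

print(p)
```

Each label sits at the x position of its median and at the height of its cut's density peak, so the labels are separated vertically only as much as the peaks themselves differ. If two cuts have similar peak heights and similar medians, their labels can still collide.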
